[WIP] [query] use otel's implementation for constructing http and grpc servers #6055

Draft: wants to merge 12 commits into main

Conversation

mahadzaryab1
Collaborator

Which problem is this PR solving?

Description of the changes

How was this change tested?

Checklist


codecov bot commented Oct 5, 2024

Codecov Report

Attention: Patch coverage is 78.57143% with 21 lines in your changes missing coverage. Please review.

Project coverage is 96.34%. Comparing base (acbc0e1) to head (3231bfd).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| cmd/query/app/server.go | 78.12% | 18 Missing and 3 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6055      +/-   ##
==========================================
- Coverage   96.46%   96.34%   -0.13%     
==========================================
  Files         352      352              
  Lines       19986    20045      +59     
==========================================
+ Hits        19280    19312      +32     
- Misses        522      544      +22     
- Partials      184      189       +5     
| Flag | Coverage Δ |
|---|---|
| badger_v1 | 8.42% <ø> (ø) |
| badger_v2 | 1.70% <ø> (ø) |
| cassandra-4.x-v1 | 14.57% <ø> (ø) |
| cassandra-4.x-v2 | 1.64% <ø> (ø) |
| cassandra-5.x-v1 | 14.57% <ø> (ø) |
| cassandra-5.x-v2 | 1.64% <ø> (ø) |
| elasticsearch-6.x-v1 | 18.72% <ø> (-0.02%) ⬇️ |
| elasticsearch-7.x-v1 | 18.81% <ø> (+0.01%) ⬆️ |
| elasticsearch-8.x-v1 | 18.98% <ø> (ø) |
| elasticsearch-8.x-v2 | 1.69% <ø> (-0.02%) ⬇️ |
| grpc_v1 | 8.77% <ø> (ø) |
| grpc_v2 | ? |
| kafka-v1 | 8.99% <ø> (ø) |
| kafka-v2 | 1.70% <ø> (ø) |
| memory_v2 | 1.70% <ø> (ø) |
| opensearch-1.x-v1 | 18.86% <ø> (+0.01%) ⬆️ |
| opensearch-2.x-v1 | 18.85% <ø> (ø) |
| opensearch-2.x-v2 | 1.69% <ø> (ø) |
| tailsampling-processor | 0.48% <ø> (ø) |
| unittests | 95.26% <78.57%> (-0.13%) ⬇️ |

Flags with carried forward coverage won't be shown.

@mahadzaryab1
Collaborator Author

@yurishkuro do you have any thoughts on how to handle the sharing of ports problem?

@yurishkuro
Member

We should follow our deprecation policy of two versions' notice. This means that for v2 we can go straight to the new method with distinct ports. For v1 we need to print a warning if the same port is used, telling the user this will not be supported in the future (and call it out in the release notes). Then, two releases later, we switch the default to not allow the same port but still support it via a special flag (deprecated from the start), and two releases after that we remove that too.

This also means we should have another release tag, "deprecated", and change the release notes script to give it a separate section.
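
As an illustration of the v1 step above (warn, but keep working, when both endpoints share a port), here is a minimal Go sketch. The helper name `sharesPort` and the example endpoints are assumptions made for this sketch and are not taken from the PR; the warning text follows the wording suggested later in the review.

```go
package main

import (
	"fmt"
	"net"
)

// sharesPort reports whether two host:port endpoints use the same port.
// Hypothetical helper for illustration; not part of the Jaeger code base.
// Ports are compared as strings, so a named port and its number differ.
func sharesPort(httpHostPort, grpcHostPort string) (bool, error) {
	_, httpPort, err := net.SplitHostPort(httpHostPort)
	if err != nil {
		return false, fmt.Errorf("invalid HTTP endpoint %q: %w", httpHostPort, err)
	}
	_, grpcPort, err := net.SplitHostPort(grpcHostPort)
	if err != nil {
		return false, fmt.Errorf("invalid gRPC endpoint %q: %w", grpcHostPort, err)
	}
	return httpPort == grpcPort, nil
}

func main() {
	// Example endpoints only; both use the same port on purpose.
	shared, err := sharesPort(":16686", ":16686")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if shared {
		// During the deprecation window the server still starts,
		// but the user is told the setup will stop being supported.
		fmt.Println("WARN: using the same port for gRPC and HTTP is deprecated; please use dedicated ports instead")
	}
}
```

The later steps of the policy (rejecting a shared port by default, then removing the opt-in flag) would change what happens in the `shared` branch, not how the condition is detected.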

@mahadzaryab1
Collaborator Author

> We should follow our deprecation policy of two versions' notice. This means that for v2 we can go straight to the new method with distinct ports. For v1 we need to print a warning if the same port is used, telling the user this will not be supported in the future (and call it out in the release notes). Then, two releases later, we switch the default to not allow the same port but still support it via a special flag (deprecated from the start), and two releases after that we remove that too.
>
> This also means we should have another release tag, "deprecated", and change the release notes script to give it a separate section.

Got it! Let me make some changes.

Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
@mahadzaryab1
Collaborator Author

@yurishkuro do you have any thoughts so far? it looks pretty messy but i'm not sure if there's a cleaner way to do this.

cmd/query/app/server.go (outdated)
- func NewServer(querySvc *querysvc.QueryService,
+ func NewServer(
+ ctx context.Context,
+ host component.Host,
Member

maybe we can wrap it into telemetery.Setting

return nil, err
}
} else {
telset.Logger.Error("using the same port for gRPC and HTTP is deprecated; please use dedicated host ports intead")
Member

Suggested change
- telset.Logger.Error("using the same port for gRPC and HTTP is deprecated; please use dedicated host ports intead")
+ telset.Logger.Warn("using the same port for gRPC and HTTP is deprecated; please use dedicated ports instead")

var grpcServer *grpc.Server
var httpServer *httpServer
if separatePorts {
grpcServer, err = createGRPCServer(ctx, host, querySvc, metricsQuerySvc, options, tm, telset)
Member

I think this is unnecessary bifurcation at this point. I would instead split this function into two, create and registerEndpoints(server). Pass a legacy=!separatePorts argument to create:

func createGRPCServer(legacy bool, ... {
  if !legacy {
    return otel.Create...
  }
  // legacy code
}

Collaborator Author

@yurishkuro how would we register the endpoints after creating the server?

Member

Same way they are currently registered in the create() function. My point is that we're only changing how the server is created, not how endpoints are mounted into it.

Collaborator Author

@yurishkuro otel's ToServer takes in a handler, which has the routes registered to it. So how would we register the routes after creating the server using ToServer?

Member

@yurishkuro Oct 6, 2024

OK, so for gRPC the order is (1) create the server, (2) register the handlers. For HTTP it's (1) create the root handler with all sub-handlers registered, and (2) create the server. In both cases we can factor out the handler creation, which does not differ between v1 and v2.

Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
@mahadzaryab1
Collaborator Author

@yurishkuro do you know why the all-in-one test is failing, by any chance?

mahadzaryab1 and others added 2 commits October 9, 2024 07:46
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
@yurishkuro
Member

should have 1 item(s), but has 2

It seems some unexpected spans are being created. This may not necessarily be a bad thing, i.e. we may need to adjust the test to allow that. I'd recommend running the test with the docker kill commented out so that you can inspect the trace in the Jaeger UI and understand what exactly is being created and whether we want it.

@mahadzaryab1
Collaborator Author

> should have 1 item(s), but has 2
>
> It seems some unexpected spans are being created. This may not necessarily be a bad thing, i.e. we may need to adjust the test to allow that. I'd recommend running the test with the docker kill commented out so that you can inspect the trace in the Jaeger UI and understand what exactly is being created and whether we want it.

@yurishkuro This is what I'm seeing in the Jaeger UI. What are your thoughts?

Screenshot 2024-10-19 at 10 32 56 PM

@yurishkuro
Member

The /api/traces request, is it a GET or a POST? GET is OK to trace, POST is not.

@mahadzaryab1
Collaborator Author

> The /api/traces request, is it a GET or a POST? GET is OK to trace, POST is not.

@yurishkuro It looks to be a GET request.

Screenshot 2024-10-20 at 1 25 19 PM

@yurishkuro
Member

Ok, then how is it different from how the test worked before your change? Did we not have tracing on that endpoint?

@mahadzaryab1
Collaborator Author

@yurishkuro It looks like the second trace is the one that is new.

@yurishkuro
Member

I think we should improve the selectivity of the test: it should not break if some extra trace is created, since its objective is to check that the trace it needs is created.
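
One way to express that in a test is to search the results for the expected trace instead of asserting on their count. The sketch below is illustrative only: the `Trace` and `Span` types, the `findTrace` helper, and the service and operation names are all hypothetical and do not reflect the actual all-in-one test.

```go
package main

import "fmt"

// Span and Trace are pared-down stand-ins for query results.
type Span struct {
	Service   string
	Operation string
}

type Trace struct {
	TraceID string
	Spans   []Span
}

// findTrace returns the first trace that contains a span matching the
// given service and operation, ignoring any unrelated traces.
func findTrace(traces []Trace, service, operation string) (Trace, bool) {
	for _, t := range traces {
		for _, s := range t.Spans {
			if s.Service == service && s.Operation == operation {
				return t, true
			}
		}
	}
	return Trace{}, false
}

func main() {
	// Two traces come back: the one the test cares about and an extra one.
	traces := []Trace{
		{TraceID: "a", Spans: []Span{{Service: "example-service", Operation: "/api/services"}}},
		{TraceID: "b", Spans: []Span{{Service: "example-service", Operation: "/api/traces"}}},
	}
	// A strict check such as len(traces) == 1 would fail here,
	// while the selective check still finds the expected trace.
	if t, ok := findTrace(traces, "example-service", "/api/services"); ok {
		fmt.Println("found expected trace:", t.TraceID)
	} else {
		fmt.Println("expected trace is missing")
	}
}
```

With this shape the test still fails when the required trace is absent, which is the property it is meant to guard.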

Development

Successfully merging this pull request may close these issues.

[query] Switch to use OTEL's http/grpc servers
2 participants