-
Notifications
You must be signed in to change notification settings - Fork 4.6k
stats/otel: Add subchannel metrics (A94) #8738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #8738 +/- ##
==========================================
+ Coverage 83.22% 83.35% +0.13%
==========================================
Files 419 419
Lines 32454 32504 +50
==========================================
+ Hits 27009 27095 +86
+ Misses 4057 4020 -37
- Partials 1388 1389 +1
🚀 New features to boost your workflow:
|
|
A94 states the following: How are we currently handling this in the pickfirst metrics? |
|
A94 states the following: Can you please ensure that we have an issue filed to track the removal of the old metrics and that it captures the correct release where it needs to be removed. |
Here is how pickfirst handles this:
|
| Description: "EXPERIMENTAL. Number of open connections.", | ||
| Unit: "{attempt}", | ||
| Labels: []string{"grpc.target"}, | ||
| OptionalLabels: []string{"grpc.lb.backend_service", "grpc.security_level", "grpc.lb.locality"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if the order of these labels matters, but the order here is not consistent with the order specified in the gRFC.
| labels := make(map[string]string) | ||
| labels["grpc.lb.locality"] = locality | ||
| labels["grpc.lb.backend_service"] = cluster | ||
| return labels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this can be shorted using the literal initialization syntax in Go for maps:
return map[string]string {
"grpc.lb.locality": locality,
"grpc.lb.backend_service": cluster,
}| // address atrributes. | ||
| func AddressToTelemetryLabels(addr resolver.Address) map[string]string { | ||
| cluster, _ := GetXDSHandshakeClusterName(addr.Attributes) | ||
| locality := LocalityString(GetLocalityID(addr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who sets the locality attribute on the address? I see that there is a SetLocality function, but there are no callers of it.
| var locality, backendService string | ||
| labelsFromAddress, ok := internal.AddressToTelemetryLabels.(func(resolver.Address) map[string]string) | ||
| if len(ac.addrs) > 0 && internal.AddressToTelemetryLabels != nil && ok { | ||
| labels := labelsFromAddress(ac.addrs[0]) | ||
| locality = labels["grpc.lb.locality"] | ||
| backendService = labels["grpc.lb.backend_service"] | ||
| } | ||
| return locality, backendService |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could also be shorted as follows:
| var locality, backendService string | |
| labelsFromAddress, ok := internal.AddressToTelemetryLabels.(func(resolver.Address) map[string]string) | |
| if len(ac.addrs) > 0 && internal.AddressToTelemetryLabels != nil && ok { | |
| labels := labelsFromAddress(ac.addrs[0]) | |
| locality = labels["grpc.lb.locality"] | |
| backendService = labels["grpc.lb.backend_service"] | |
| } | |
| return locality, backendService | |
| labelsFunc, ok := internal.AddressToTelemetryLabels.(func(resolver.Address) map[string]string) | |
| if !ok || len(ac.addrs) == 0 { | |
| return "", "" | |
| } | |
| labels := labelsFunc(ac.addrs[0]) | |
| return labels["grpc.lb.locality"], labels["grpc.lb.backend_service"] |
Follows from the principle of handling errors before proceeding to the rest of the code: See: go/go-style/decisions#indent-error-flow
| func (ac *addrConn) securityLevel() string { | ||
| var secLevel string | ||
| if ac.transport == nil { | ||
| secLevel, _ = ac.curAddr.Attributes.Value(securityLevelKey{}).(string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is a valid security_level when the connection is established (and the upDown counter is being incremented), and there is no valid security_level when the connection is disconnected (and the upDown counter is being decremented), does it affect the metric in anyway since the labels are going to be different on the two calls?
Also, an empty string is not a valid value for the security level as per the gRFC:
Allowed values - "none", "integrity_only" and "privacy_and_integrity".
So, would it make sense to drop the nil check here and instead pass none from where the disconnections are handled? Or if the change in the optional labels matters, would we have to store the security_level in the addrConn when the connection is established so that we can use the same value when the connection is terminated?
| } | ||
| locality, backendService := fetchLabels(ac) | ||
| if ac.state == connectivity.Ready || (ac.state == connectivity.Connecting && s == connectivity.Idle) { | ||
| disconnectionsMetric.Record(ac.cc.metricsRecorderList, 1, ac.cc.target, backendService, locality, "unknown") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the disconnect_error label set to unknown here? The gRFC talks about a set of possible values for this label.
| if ac.state == s { | ||
| return | ||
| } | ||
| locality, backendService := fetchLabels(ac) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of fetching the labels everytime the connectivity state changes, would it make sense to fetch them when the addrConn is created as part of handling NewSubConn and updating the labels when UpdateAddresses is invoked?
Addresses : https://github.com/grpc/proposal/blob/master/A94-subchannel-otel-metrics.md
this PR adds sub channel metrics with applicable labels as per the RFC proposal.
RELEASE NOTES: