-
Notifications
You must be signed in to change notification settings - Fork 283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
initial studio agent, client awareness and operation reporting #309
Changes from 1 commit
f4d71ae
b03ead8
a1b6439
50a24d9
083ca02
4822655
e5b8f2a
6ff030d
faa3272
a63748f
85da9eb
ceff8dc
5c03ffc
06a6117
f84cdb1
c839ec9
9280c35
8def90a
d60aaea
de28a3b
8abaebc
de78fa0
24836db
de66553
1e374b3
9715096
d0b0d45
96cea73
31e2520
1a339dc
29a8f42
4c370f3
348cb81
b33fb06
4548890
084f55c
63391e4
8a049fa
b00bdda
7e76cf0
91d6231
f8bcdc9
52c7e09
fd82b60
ca8f210
eaa97d8
297d8d1
346cc77
244f7a9
663ce7b
e208be1
5cefecf
8cd1ee4
ad289e3
9819840
d8d630b
e705d6f
c0fafa9
b99d796
eee04d9
8cd56ec
f0e4bd8
99526d0
87314b7
b2a5c28
949bf03
aaf8ba0
9e73d28
3ffff59
8a86a88
b87ff86
3f873b6
b2afdc0
0d86b50
92988c6
a2bb359
7026a33
27caf08
25ad732
6c5be65
6ff86de
a7d7a59
220ccaa
9f2e421
fd7f130
6da5f92
7da9dcc
78d2004
07001d2
1b15dad
f1a78a9
13565b6
1f6ef44
03d809b
d5b2d7f
2b01014
2b78f00
af58cb9
92b6df9
d6fb85e
2e3efba
adb2e7b
042f837
b9ed1b8
2c83393
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
Use the approved stats_report_key from the TS implementation, although I may not be using them entirely correctly just now... Also: fix the format error that David noted and remove the "space"
- Loading branch information
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -232,7 +232,7 @@ impl SpanExporter for Exporter { | |||
tracing::debug!(index, %span.name, ?span.start_time, ?span.end_time); | ||||
if span.name == "graphql_request" { | ||||
tracing::debug!("span: {:?}", span); | ||||
if let Some(q) = span | ||||
if let Some(query) = span | ||||
.attributes | ||||
.get(&opentelemetry::Key::from_static_str("query")) | ||||
{ | ||||
|
@@ -241,7 +241,7 @@ impl SpanExporter for Exporter { | |||
span.end_time.duration_since(span.start_time).unwrap(), | ||||
1, | ||||
); | ||||
tracing::debug!("query: {}", q); | ||||
tracing::debug!("query: {}", query); | ||||
|
||||
let not_found = Value::String(Cow::from("not found")); | ||||
let stats = ContextualizedStats { | ||||
|
@@ -268,7 +268,7 @@ impl SpanExporter for Exporter { | |||
.attributes | ||||
.get(&opentelemetry::Key::from_static_str("operation_name")); | ||||
// XXX NEED TO NORMALISE THE QUERY | ||||
let key = normalize(operation_name, &q.as_str()); | ||||
let key = normalize(operation_name, &query.as_str()); | ||||
|
||||
let msg = reporter | ||||
.submit_stats(graph.clone(), key, stats) | ||||
|
@@ -285,22 +285,31 @@ impl SpanExporter for Exporter { | |||
} | ||||
} | ||||
|
||||
fn normalize(op: Option<&opentelemetry::Value>, q: &str) -> String { | ||||
// If we don't have an operation name, no point normalizing | ||||
// it. Just return the unprocessed input. | ||||
// Taken from TS implementation | ||||
static GRAPHQL_PARSE_FAILURE: &str = "## GraphQLParseFailure\n"; | ||||
static GRAPHQL_VALIDATION_FAILURE: &str = "## GraphQLValidationFailure\n"; | ||||
static GRAPHQL_UNKNOWN_OPERATION_NAME: &str = "## GraphQLUnknownOperationName\n"; | ||||
|
||||
fn normalize(op: Option<&opentelemetry::Value>, query: &str) -> String { | ||||
// If we don't have an operation name, we can't do anything useful | ||||
// with this query. Just return the appropriate error. | ||||
let op_name: String = match op { | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you help me understand where this operation name comes from? It looks like it comes from an otel attribute which I think comes from
My naive interpretation of that code is that this is specifically the out of band operationName explicitly specified in a GraphQL request, ie like But in the very common case where a named query is provided without an explicit out of band operationName, ie a request like I think maybe this was something that you were waiting on more apollo-rs support for before you can implement it properly? In that case this should probably be clearly called out via a comment as not currently implementing the right logic. (If my cursory read of the code is wrong and this will properly get There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You are correct. I'll raise an issue to add a comment for this. Is this a problem that could be solved by enhancing the router_bridge interface as your previously described? (I think not, but I'm asking anyway). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it could — IMO it doesn't really make sense to "resolve" the operation name for a request until you've validated the request (because then you can rely on uniqueness of operation names, etc). |
||||
Some(v) => v.as_str().into_owned(), | ||||
None => return q.to_string(), | ||||
None => { | ||||
tracing::warn!("Could not identify operation name: {}", query); | ||||
return GRAPHQL_UNKNOWN_OPERATION_NAME.to_string(); | ||||
} | ||||
}; | ||||
|
||||
let parser = Parser::new(q); | ||||
let parser = Parser::new(query); | ||||
// compress *before* parsing to modify whitespaces/comments | ||||
let ast = parser.compress().parse(); | ||||
tracing::debug!("ast:\n {:?}", ast); | ||||
// If we can't parse the query, we definitely can't normalize it, so | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We used to do that but that led to (a) tons of cardinality coming from broken input and (b) unsanitized operations getting sent to us. Instead we have special stats report keys for the error cases of "parse error" (invalid GraphQL syntax), "validation error" (operation doesn't work with the schema), and "unknown operation name" (the document is valid but the separately-provided operation name doesn't match an operation in the doc, or isn't provided and there are multiple ops there). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is WIP until the normalisation support in apollo-rs is available. I'll look at this again then. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. By the way I thought it might be helpful for me to be more explicit about what the three kinds of top-level errors mean, since it's a bit subtle. A GraphQL request (over HTTP/JSON) consists of up to three fields:
(we also have added GraphQL requests are executed in the context of a
The other kind of errors that can arise happen at execution time. Most of them are "field errors"; the one other kind of "request error" is if a variable value cannot be coerced to its declared type. |
||||
// just return the un-processed input | ||||
// just return the appropriate error. | ||||
if ast.errors().len() > 0 { | ||||
return q.to_string(); | ||||
tracing::warn!("Could not parse query: {}", query); | ||||
return GRAPHQL_PARSE_FAILURE.to_string(); | ||||
} | ||||
let doc = ast.document(); | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there a step here that applies the standard GraphQL Validation algorithm? ie, the thing that checks that the query actually works with the graph, and that it has all sorts of necessary internal consistencies? Does this happen elsewhere in the router, or is it something that's waiting on apollo-rs support? ie, that means implementing all the logic in http://spec.graphql.org/draft/#sec-Validation Anyway, this algorithm requires us to know whether the operation passes validation before checking for operation names. (Otherwise you can't assume things like "there's at most one operation for a given name" or "if there's an anonymous operation then it's the only one".) So that should go here. Though presumably this is less a matter of "we should run the validation operation in the middle of ... ... ok, I'm now realizing that the way that validation works in router today (confirmed with @abernix) is that we run the graphql validation in graphql-js over the bridge. (or actually... we might not actually run validation at all right now? but if so, that's a bug and the plan I believe is to do it in graphql-js for now. validate runs on version-0.x but not main.) So I think what you need to do is extend the router bridge API so that the status of GraphQL validation comes back explicitly from the JS query planner, and that that boolean is made available to the telemetry code. That said... if you're already making this normalize function dependent on something returned from JS... maybe the best short term plan is to just do the entire stats report key calculation in JS. I'd be happy to collaborate on making the stats report key code in apollo-server a bit more accessible to router-bridge code, or honestly copy-pasting it there would be fine too. This would mean you'd be able to have gateway-compatible stats report keys right now before apollo-rs gives you all the tools you need; implementing stats report key generation in Rust can be part of the broader project of porting JS stuff to Rust rather than a release blocker. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is no logic here to validate the graphQL. I think your suggestion to extend the router bridge makes most sense for this body of problems. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||||
tracing::debug!("{}", doc.format()); | ||||
garypen marked this conversation as resolved.
Show resolved
Hide resolved
|
||||
|
@@ -310,25 +319,24 @@ fn normalize(op: Option<&opentelemetry::Value>, q: &str) -> String { | |||
.into_iter() | ||||
.filter(|x| { | ||||
if let ast::Definition::OperationDefinition(op_def) = x { | ||||
match op_def.name() { | ||||
Some(v) => { | ||||
return v.text() == op_name; | ||||
} | ||||
None => { | ||||
return op_name == "-"; | ||||
} | ||||
} | ||||
return match op_def.name() { | ||||
Some(v) => v.text() == op_name, | ||||
None => op_name == "-", | ||||
}; | ||||
} | ||||
false | ||||
}) | ||||
.collect(); | ||||
tracing::debug!("required definitions: {:?}", required_definitions); | ||||
assert_eq!(required_definitions.len(), 1); | ||||
if required_definitions.len() != 1 { | ||||
garypen marked this conversation as resolved.
Show resolved
Hide resolved
|
||||
tracing::warn!("Could not find required single definition: {}", query); | ||||
return GRAPHQL_VALIDATION_FAILURE.to_string(); | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Whether this should be a validation failure or an unknown operation name failure depends on exactly what's going on here, though. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right. If we extend the router-bridge as described I presume we'll have the information we need and won't need to do this checking. |
||||
} | ||||
let required_definition = required_definitions.pop().unwrap(); | ||||
tracing::debug!("required_definition: {:?}", required_definition); | ||||
// XXX Somehow find fragments... | ||||
let def = required_definition.format(); | ||||
format!("# {} \n{}", op_name, def) | ||||
format!("# {}\n{}", op_name, def) | ||||
} | ||||
|
||||
struct DurationHistogram { | ||||
|
@@ -360,7 +368,6 @@ impl DurationHistogram { | |||
return DurationHistogram::MAXIMUM_SIZE; | ||||
} | ||||
|
||||
println!("unbounded_bucket: {}", unbounded_bucket); | ||||
unbounded_bucket as usize | ||||
} | ||||
|
||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,13 +1,8 @@ | ||
--- | ||
source: apollo-router/src/apollo_telemetry.rs | ||
assertion_line: 279 | ||
assertion_line: 409 | ||
expression: normalized | ||
|
||
--- | ||
|
||
{ | ||
user { | ||
name | ||
} | ||
} | ||
## GraphQLUnknownOperationName | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,13 +1,8 @@ | ||
--- | ||
source: apollo-router/src/apollo_telemetry.rs | ||
assertion_line: 293 | ||
assertion_line: 423 | ||
expression: normalized | ||
|
||
--- | ||
|
||
query { | ||
user { | ||
name | ||
} | ||
} | ||
## GraphQLUnknownOperationName | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,8 @@ | ||
--- | ||
source: apollo-router/src/apollo_telemetry.rs | ||
assertion_line: 308 | ||
assertion_line: 438 | ||
expression: normalized | ||
|
||
--- | ||
# OpName | ||
# OpName | ||
query OpName { user { name } } |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,20 +1,8 @@ | ||
--- | ||
source: apollo-router/src/apollo_telemetry.rs | ||
assertion_line: 329 | ||
assertion_line: 459 | ||
expression: normalized | ||
|
||
--- | ||
|
||
{ | ||
user { | ||
name | ||
...Bar | ||
} | ||
} | ||
fragment Bar on User { | ||
asd | ||
} | ||
fragment Baz on User { | ||
jkl | ||
} | ||
## GraphQLUnknownOperationName | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd highly encourage you to name this function something like "statsReportKey" as that's the name we use for this in our protocol. "normalize" is a very ambiguous name as there are many different operation normalization algorithms in our platform (and basically every time we use it it includes stripping comments, so none of them would yield a stats report key with the comment-ish first line).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll rename it to "stats_report_key". Rust prefers snake case for function names:
https://doc.rust-lang.org/1.0.0/style/style/naming/README.html