Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Telemetry: Child group spans aren't properly nested #172

Open
rubberduck203 opened this issue Dec 9, 2024 · 2 comments
Open

Telemetry: Child group spans aren't properly nested #172

rubberduck203 opened this issue Dec 9, 2024 · 2 comments

Comments

@rubberduck203
Copy link
Collaborator

Expected behavior

When one ScopeDoctorGroup needs another, I would expect the outermost group span to be a parent of the group(s) it imports.
For example:

scope doctor run    ==========================================
    doctor run             ==================================
        group (outer)           ====================
        group (inner)              =====
        group (sibling)                  =====

This would let us properly time each group and quickly see which groups are taking longest to run.

Actual Behavior

The parent group comes after its children at a sibling level.

Screenshot 2024-12-09 at 09 21 16

MVCE (Minimal Verifiable Complete Example)

zip archive for convenience

 unzip scope-telemetry-mvce.zip
 cd scope-telemetry-mvce
 docker compose up -d
 open http://localhost:16686
 SCOPE_OTEL_ENDPOINT=http://localhost:4318 scope doctor run

Directory structure

tree -a
.
├── .scope
│   ├── child-group.yaml
│   └── mvce.yaml
└── docker-compose.yml

mvce.yaml

apiVersion: scope.github.com/v1alpha
kind: ScopeDoctorGroup
metadata:
  name: mvce
spec:
  needs:
    - child-group
  actions: []

child-group.yaml

apiVersion: scope.github.com/v1alpha
kind: ScopeDoctorGroup
metadata:
  name: child-group
spec:
  include: when-required
  needs: []
  actions: []

docker-compose.yaml

name: jaeger
services:
    all-in-one:
        container_name: jaeger
        environment:
            - COLLECTOR_ZIPKIN_HOST_PORT=:9411
            - COLLECTOR_OTLP_ENABLED=true
        ports:
            - 6831:6831/udp
            - 6832:6832/udp
            - 5778:5778
            - 16686:16686
            - 4317:4317
            - 4318:4318
            - 14250:14250
            - 14268:14268
            - 14269:14269
            - 9411:9411
        image: jaegertracing/all-in-one:latest

Proposed Fix

The runner always sets the parent span for all groups to doctor run.

let group_span = info_span!(parent: &header_span, "group", "indicatif.pb_show" = true, "group.name" = group_name);

And the groups are passed as a Vec

async fn run_path(&self, groups: Vec<&GroupActionContainer<T>>) -> Result<PathRunResult> {

In order to fix this, we would need to preserve the tree of groups and ensure that only root nodes have their parent set to doctor run while child groups properly point to their parent group.

It's worth noting that it's actually a forest, as multiple root nodes are currently possible.

I'm sure there are other benefits to preserving this structure in the code that I haven't thought of.

@rubberduck203
Copy link
Collaborator Author

Upon further reflection, I believe that a valid scope config is a directed acyclic graph as multiple groups (including multiple root level groups) may have dependencies on the same groups.

That would be the correct data structure. Transforming the configuration into a DAG would ensure things are always executed in the correct order, validation would provide a guarantee no cycles are ever entered, and navigating the structure would naturally allow for proper spanning.

@rubberduck203
Copy link
Collaborator Author

So, we're already using a DiGraph data structure, which is great.

let mut graph = DiGraph::<&str, i32>::new();

but it does seem that we're not walking the directed graph during execution.
We're just returning a Vec<string> of group names to be executed.
That's why the spans don't nest properly, we're not actually opening the span of the parent group prior to opening the span of the child group.

pub fn compute_group_order(
groups: &BTreeMap<String, DoctorGroup>,
desired_groups: BTreeSet<String>,
) -> Vec<String> {

Tangentially related side note:
I don't believe we're computing the transitive reduction of the graph.
I may have just missed it, but petgraph has an implementation we should probably use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant