-
Notifications
You must be signed in to change notification settings - Fork 930
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs/nodes-grouping #4519
base: main
Are you sure you want to change the base?
docs/nodes-grouping #4519
Conversation
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Thanks for the PR, @Huongg - great job! Here are a few initial comments:
![]() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general content looks good to me, but I'm not sure the table is the right format. It's quite small and the bullet points aren't formatted nicely inside the columns. Perhaps you can just write the content inside paragraphs with clear headings?
I've also left some comments on the phrasing, I find the tone of the text a bit too marketing style, like we're trying to sell this feature instead of explaining it. I'd suggest changing the tone to make it more informative and explain the why more than the what.
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
7fae928
to
53dae3c
Compare
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
hey @DmitrySorokinQB and @merelcht thanks for reviewing this PR! I agree that the table format was quite hard to read in this form. I've updated it to use paragraphs instead, but I've also included a summary table at the end to highlight the key points and make it easier for users. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @Huongg and team! Left a few comments.
More broadly, the "what works in Kedro - what doesn't work in Kedro - how to use" section titles feel a bit repetitive, but the "best used when - not to use when" are useful. Maybe when we're settled on the content we can turn some of these bullet lists into prose?
 | ||
|
||
### What doesn't work with Kedro | ||
- If you need to create custom node groupings that are different with your existing pipelines, creating new pipelines for them may not be convenient. Instead, you can use alternative grouping methods such as tags or namespaces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand this sentence 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think what I'm trying to say is "if you want to group nodes to something different than the current pipeline structure, instead of creating a new pipeline, using tags or namespace can help you with that"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make it clearer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"If you want to group nodes differently from the current pipeline structure, instead of creating a new pipeline, you can use tags or namespaces to achieve that." ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, now I understand! However, it's not really a disadvantage right? It's just that users have an alternative in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah thats true, it also doesnt fit in any categories. Maybe there's no need to mention this at all.
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
I had a first read, and overall, it looks great! Initially, I felt that node grouping could technically be done without Kedro-Viz. That said, visualizing how you group nodes is crucial, and Kedro-Viz plays a key role in that. The Viz screenshots also make it much easier to understand each approach we take. |
Docs are broken because MLflow just migrated theirs to Docusaurus 👀 mlflow/mlflow#13645 |
New link is https://mlflow.org/docs/latest/tracking/server/ |
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Unclear where this error comes from, the link looks just okay 🤔 maybe add a final slash?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, thanks a lot @Huongg 🙏🏼 two departing comments:
- The hierarchy still seems broken for some reason, compare
latest | this pr |
---|---|
![]() |
![]() |
- I still think it would be good to turn the bullet points of what was "what works - what doesn't work" into prose. The bullet points of "best used when - not to use when" do look good though, as a kind of summary.
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
hey @astrojuanlu, Thanks for the review and for catching the bug in the content navigation! I've updated the structure to reflect the fix. ![]() I also consolidated all the points, and they look much better now. As for the failed test, it’s passing now—sometimes links break randomly but resolve themselves after a new build. |
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. thanks!
Just one question - in the navigation panel, under 'Deployment', does it make sense to keep 'Node Grouping at the beginning' rather than at the end because the grouping comes early in the workflow.
hey @rashidakanchwala good point, however I think its good practice to have the navigation bar reflects the structure of the content page. So if we change the order of the navigation bar, we will need to consider to change the structure of the content page too. cc @iamelijahko and @stephkaiser to get their thoughts on this |
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some small comments on wording. The most important point I think is around the "when not to use" section for namespaces.
**Not to use when** | ||
- The tagged nodes have strong dependencies, which might cause execution failures. | ||
- Tags are not hierarchical, so tracking groups of nodes can become difficult. | ||
- Tags do not enforce structure like pipelines or namespaces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this last point isn't so much a Not to use when
but more a note on tags. Maybe it makes sense to mention this before the best used when/not to use when sections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DONE 👍
- Your pipeline structure is well-defined, and using namespaces improves visualisation in Kedro-Viz. | ||
- Customising deployment groups by adding namespaces at the node level. | ||
|
||
**Not to use when** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like in this section these are more phrased like "downsides" of namespaces rather than Not to use when
points. Maybe we can say something like: not to use when namespaces overcomplicate an otherwise simple pipeline structure or not to use when editing the catalog is not wanted. Maybe a bit more elaborate, but in any case I think the point we need to bring across here is that namespaces will add complexity and so it's not advised to use them in simple projects where pipeline grouping is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, I've updated the copy to highlight the key point you mentioned. Let me know what you think ? @merelcht
**Not to use when**
- In small and simple projects, using namespaces can introduce unnecessary complexity, making pipeline grouping a more suitable choice.
- Namespaces require additional effort, such as updating catalog names, since namespace prefixes are automatically applied to all the elements unless explicitly overridden in the namespaced pipeline parameters.
I also included the example on how users can overrides the parameters directly in the pipeline function if they dont want to rename in the catalog
You will need to update your catalog accordingly. If you don't want to change the names of your inputs, outputs, or parameters with the
namespace_name.
prefix while using a namespace, you should list these objects inside the corresponding parameters of thepipeline()
creation function. For example:return pipeline( base_pipeline, namespace = "new_namespaced_pipeline", # With that namespace, "new_namespaced_pipeline" prefix will be added to inputs, outputs, params, and node names inputs={"the_original_input_name"}, # Inputs remain the same, without namespace prefix )
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Signed-off-by: Huong Nguyen <huong.nguyen@mckinsey.com>
Description
closes #4441
Documentation to guide users on various ways to group their nodes in Kedro, including pipelines, tags, and namespaces.
Rendered version: Link
Development notes
Developer Certificate of Origin
We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a
Signed-off-by
line in the commit message. See our wiki for guidance.If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.
Checklist
RELEASE.md
file