Install Daft on existing Kubernetes cluster and submit jobs using daft-launcher #44

Merged: 5 commits, Jan 23, 2025

@jessie-young (Collaborator) commented Jan 20, 2025

Support Daft on Kubernetes

This PR introduces support for Daft on existing Kubernetes clusters (BYOC - Bring Your Own Cluster) and improves documentation around cluster setup and SSH configuration.

Major Changes

1. Kubernetes Setup Documentation

  • Added detailed guides for setting up Ray and Daft on existing Kubernetes clusters:
    • Local development (kind/minikube)
    • Cloud-managed clusters (EKS/GKE/AKS)
    • On-premise deployments
  • Centralized Ray installation instructions using KubeRay
  • Added architecture-specific instructions (ARM64 vs x86)

2. CLI Enhancements & Command Support

Current command support matrix:

Command AWS Mode K8s Mode
init
up
submit
stop
kill
list
connect
ssh
sql

3. SSH Key Setup Documentation

Added instructions for SSH key setup in the README. Would appreciate review on:

  • Whether the instructions are clear and accurate
  • Whether we should automate SSH key creation/setup during daft up
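
To make the automation question concrete, here is a minimal sketch of what key creation during daft up could look like. It is not the launcher's implementation: the function names, the default comment string, and the choice of an ed25519 key are all assumptions for illustration, and the README's actual flow (for example, a key pair registered with the cloud provider) may differ. The only external tool it relies on is the standard `ssh-keygen` binary.

```python
import subprocess
from pathlib import Path


def keygen_command(key_path, comment="daft-launcher"):
    """Build the ssh-keygen invocation (hypothetical helper, not part of daft-launcher).

    -t ed25519 : key type (an assumption; the README may call for a different type)
    -f         : output path of the private key
    -N ""      : empty passphrase, so no interactive prompt appears
    -C         : comment stored in the public key
    """
    return ["ssh-keygen", "-t", "ed25519", "-f", str(key_path), "-N", "", "-C", comment]


def ensure_ssh_key(key_path):
    """Create a key only if none exists at key_path.

    Returns True if a new key was generated, False if one was already present.
    """
    key_path = Path(key_path).expanduser()
    if key_path.exists():
        # Never overwrite a key the user already has.
        return False
    key_path.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(keygen_command(key_path), check=True)
    return True
```

Skipping generation when the file already exists keeps the step idempotent, so it would be safe to run on every daft up.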

Questions for Reviewers

  1. Provider Naming:

    • Currently using "aws" for EC2-based provisioning
    • Consider renaming to "managed" to differentiate from BYOC EKS clusters?
    • Current: aws vs k8s
    • Proposed: managed vs byoc or managed vs k8s
  2. Command Structure:

    • Current: daft up (fails for k8s with "not supported" message)
    • Alternative proposal: daft aws up / daft k8s submit
    • Would explicit provider in command be clearer?
    • Trade-off between verbosity and clarity
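
The explicit-provider alternative can be sketched as a small lookup, shown here in Python purely for illustration. The `SUPPORTED` table and the `resolve` function are hypothetical; the table encodes only what this PR states (up is not supported for k8s, and init and submit were verified in both modes) and is not the full support matrix.

```python
# Hypothetical per-provider command table. Only facts stated in this PR are
# encoded; the remaining commands are intentionally left out.
SUPPORTED = {
    "aws": {"init", "up", "submit"},
    "k8s": {"init", "submit"},
}


def resolve(argv):
    """Validate an invocation of the form: daft <provider> <command>."""
    if len(argv) != 2:
        raise ValueError("usage: daft <provider> <command>")
    provider, command = argv
    if provider not in SUPPORTED:
        raise ValueError(f"unknown provider '{provider}'")
    if command not in SUPPORTED[provider]:
        # The provider is named on the command line, so the error can say
        # exactly which provider lacks the command.
        raise ValueError(f"'{command}' is not supported for provider '{provider}'")
    return provider, command
```

With this shape, `resolve(["k8s", "submit"])` succeeds while `resolve(["k8s", "up"])` fails immediately with a provider-specific message. The cost is the extra word in every command, which is the verbosity side of the trade-off.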

Testing Done

  • Tested Ray installation on:
    • Local kind cluster (both x86 and ARM64)
    • EKS cluster spun up by daft-launcher
  • Verified a subset of the commands in both modes: init and submit. Verification of the remaining commands will wait until the high-level approach described in this PR has been approved.
  • Tested SSH key setup process

Next Steps

  • Address reviewer feedback on naming conventions
  • Potentially implement SSH key automation
  • Move K8s documentation to separate repo after review

…r, and support BYOC k8s clusters in daft-launcher

Added docs for kuberay + daft installation, fixed minor linter issue

@raunakab (Contributor) left a comment

Looks good so far, but there are some configuration file changes that I think we should look over.

Also, I noticed that your editor strips the trailing newline (\n) from the end of files. If you don't mind, could you keep it in? Ending a file with a newline is the usual convention.
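
For reference, a minimal way to check a file for the trailing newline, assuming a plain byte-level check is enough. The function name is made up for this example and is not part of the repository's tooling.

```python
from pathlib import Path


def ends_with_newline(path):
    """Return True if the file is empty or its last byte is a newline."""
    data = Path(path).read_bytes()
    # An empty file has no final line to terminate, so treat it as fine.
    return data == b"" or data.endswith(b"\n")
```

Many editors can also be configured to insert the final newline on save, which avoids the need for a manual check.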

@jessie-young jessie-young removed the request for review from jaychia January 22, 2025 01:55
@raunakab (Contributor) previously approved these changes Jan 23, 2025

LGTM!

@raunakab merged commit db3b824 into master Jan 23, 2025
1 check passed
@raunakab deleted the k8s branch January 23, 2025 05:42