Default ramalama serve to only listen on localhost #876

Open
rhatdan wants to merge 1 commit into main from the host branch

Conversation

rhatdan (Member) commented Feb 24, 2025

Summary by Sourcery

Documentation:

  • Update documentation to reflect the change in default host address.

sourcery-ai bot commented Feb 24, 2025

Reviewer's Guide by Sourcery

This pull request modifies the ramalama serve command to default to listening only on localhost (127.0.0.0) instead of all interfaces (0.0.0.0). This change enhances security by preventing the service from being exposed to external networks by default. The configuration files, command-line argument defaults, and documentation have been updated to reflect this change.
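To make the precedence concrete, here is a small, self-contained sketch (editorial illustration, not the PR's code; the real change is the one-line edit to ramalama/cli.py shown in the diff further down): a value set in ramalama.conf overrides the built-in default, and an explicit --host flag overrides both.

```python
import argparse

# Hedged sketch of how the --host default is resolved (mirrors the pattern in
# the ramalama/cli.py diff below): the ramalama.conf value, if present,
# overrides the built-in default, and an explicit --host flag overrides both.
config = {}  # stand-in for the parsed ramalama.conf settings

parser = argparse.ArgumentParser(prog="ramalama serve")
parser.add_argument(
    "--host",
    # loopback; the PR first wrote 127.0.0.0, flagged below as a typo
    default=config.get("host", "127.0.0.1"),
    help="IP address to listen",
)

print(parser.parse_args([]).host)                     # built-in/config default
print(parser.parse_args(["--host", "0.0.0.0"]).host)  # explicit CLI override
```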

Sequence diagram for ramalama serve interaction

sequenceDiagram
    participant User
    participant CLI
    participant Container

    User->>CLI: ramalama serve
    CLI->>CLI: Reads config (host = 127.0.0.0)
    CLI->>Container: Starts container with llama.cpp listening on 127.0.0.0
    Container-->>CLI: Container running
    CLI-->>User: Service running on localhost

File-Level Changes

Change: Updated the default host address for the serve command to listen on localhost (127.0.0.0) instead of all interfaces (0.0.0.0).

Details:
  • Changed the default value of the --host argument in the serve command to 127.0.0.0.
  • Updated the default host value in the configuration file to 127.0.0.0.
  • Updated documentation to reflect the change in the default host address.

Files:
  • test/system/040-serve.bats
  • docs/ramalama.conf
  • docs/ramalama.conf.5.md
  • ramalama/cli.py


sourcery-ai bot left a comment

Hey @rhatdan - I've reviewed your changes - here's some feedback:

Overall Comments:

  • It might be worth mentioning the security implications of binding to 0.0.0.0 vs 127.0.0.1 in the commit message.
Here's what I looked at during the review
  • 🟡 General issues: 3 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

docs/ramalama.conf
@@ -42,7 +42,7 @@

# IP address for llama.cpp to listen on.
#
-#host = "0.0.0.0"
+#host = "127.0.0.0"

question (bug_risk): Clarify the loopback IP change from 0.0.0.0 to 127.0.0.0.

Typically, '127.0.0.1' is used as the loopback address. Using '127.0.0.0' might unintentionally cause binding issues. Please confirm if this change is intentional.
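As a quick, hedged illustration of the difference (not part of the PR), the snippet below probes which addresses the local OS will accept a TCP bind on; macOS typically rejects 127.0.0.0, while Linux treats the whole 127.0.0.0/8 range as loopback:

```python
import socket

def can_bind(host: str, port: int = 0) -> bool:
    # Try to bind a TCP socket to the given address on an ephemeral port.
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((host, port))
            return True
    except OSError:
        return False

for addr in ("127.0.0.1", "127.0.0.0", "0.0.0.0"):
    print(f"{addr}: {'bindable' if can_bind(addr) else 'not bindable'}")
```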

ramalama/cli.py Outdated
@@ -856,7 +856,7 @@ def serve_parser(subparsers):
run_serve_perplexity_args(parser)
add_network_argument(parser, "")
parser.add_argument("-d", "--detach", action="store_true", dest="detach", help="run the container in detached mode")
parser.add_argument("--host", default=config.get('host', "0.0.0.0"), help="IP address to listen")
parser.add_argument("--host", default=config.get('host', "127.0.0.0"), help="IP address to listen")

question (bug_risk): Review the host IP default in CLI arguments.

The default is now set to '127.0.0.0', which is uncommon for a loopback address. Confirm if this specific IP should be used or if '127.0.0.1' is more appropriate.

docs/ramalama.conf.5.md
@@ -78,7 +78,7 @@ Run RamaLama using the specified container engine.
Valid options are: Podman and Docker
This field can be overridden by the RAMALAMA_CONTAINER_ENGINE environment variable.

-**host**="0.0.0.0"
+**host**="127.0.0.0"

issue (bug_risk): Potential typo in IP address

The IP address 127.0.0.0 appears to be a typo. If the intention is to bind to the loopback interface, it should be 127.0.0.1. If all interfaces are intended, then 0.0.0.0 should be used. Given the previous value was 0.0.0.0, this is likely what was intended.

Collaborator

Good catch here

Contributor

Thank you for your feedback, we will generate more comments like this in the future.

benoitf (Contributor) left a comment

./bin/ramalama serve --port 9000 tinyllama
build: 4763 (f777a73e) with Apple clang version 16.0.0 (clang-1600.0.26.6) for arm64-apple-darwin24.2.0
system info: n_threads = 8, n_threads_batch = 8, total_threads = 12

system_info: n_threads = 8 (n_threads_batch = 8) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |

main: couldn't bind HTTP server socket, hostname: 127.0.0.0, port: 9000

on macOS it needs to be 127.0.0.1


ericcurtin commented Feb 24, 2025

I think this is a good idea, but just confirming. Will we still be able to access the server outside the container when we run:

"ramalama serve" ?

I think that's why we set it to 0.0.0.0 at least initially. I think this is worth a quick manual test to check.

I wouldn't be surprised if this config has kinda multiple contexts also, like being passed to podman, llama.cpp and/or vllm separately, can't remember.

@ericcurtin

On all platforms it needs to be 127.0.0.1; I think it's just a typo :)


benoitf commented Feb 24, 2025

@ericcurtin to your question: no it doesn't work with containers
[screenshot]

so I can't reach the port as it's not exposed by the process inside the container

benoitf dismissed their stale review February 24, 2025 16:39

pr updated


ericcurtin commented Feb 24, 2025

I think we want that server somewhat accessible externally. Although 0.0.0.0 does seem over the top and insecure.

127.0.0.1 and only accessible inside the container also seems over the top, too secure.

I think what the user would expect as a default is that it behaves sort of as if we weren't containerized at all, like if we ran llama-server on 127.0.0.1 with no containers.

@ericcurtin

> @ericcurtin to your question: no, it doesn't work with containers [screenshot]
>
> so I can't reach the port as it's not exposed by the process inside the container

And yeah it's important to test this change in particular with and without containers, because what's considered a local network is kinda different in both cases.


benoitf commented Feb 24, 2025

On macOS/Windows, there is a process on the host listening on 127.0.0.1 that redirects the traffic to the container, but the process inside the container needs to bind on 0.0.0.0.

So maybe a special case when running inside containers? Because right now, with this PR, it's failing on macOS/Windows when using containers and the default command.
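A minimal sketch of what such a special case could look like (editorial illustration only; the helper name, image name, and flag layout are assumptions, not ramalama's actual code): when a container is used, publish the port only on the host address the user asked for and let the in-container server bind 0.0.0.0; otherwise bind the requested host directly.

```python
def build_serve_command(host: str, port: int, use_container: bool) -> list[str]:
    """Hypothetical helper: pick bind/publish addresses per run mode."""
    if use_container:
        # Publish the container port only on the requested host address
        # (e.g. 127.0.0.1), while llama-server inside the container binds
        # 0.0.0.0 so the host-side port-forwarding proxy can reach it.
        return [
            "podman", "run", "--rm",
            "-p", f"{host}:{port}:{port}",
            "quay.io/ramalama/ramalama",  # image name used here for illustration
            "llama-server", "--host", "0.0.0.0", "--port", str(port),
        ]
    # No container: llama-server binds the requested host directly.
    return ["llama-server", "--host", host, "--port", str(port)]


print(build_serve_command("127.0.0.1", 8080, use_container=True))
```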


rhatdan commented Feb 24, 2025

So --host is only used with --nocontainer then? They should conflict if set.
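If that conflict were enforced, a check along these lines could work (purely illustrative; ramalama's real option handling may differ, and --nocontainer appears here only because it is named above):

```python
import argparse

parser = argparse.ArgumentParser(prog="ramalama serve")
parser.add_argument("--host", default=None, help="IP address to listen")
parser.add_argument("--nocontainer", action="store_true",
                    help="run llama-server directly on the host")
args = parser.parse_args()

# Reject an explicit --host unless the server runs outside a container,
# since the in-container bind address would be managed separately.
if args.host is not None and not args.nocontainer:
    parser.error("--host can only be used together with --nocontainer")

print(f"binding on {args.host or '127.0.0.1'}")
```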

rhatdan force-pushed the host branch 2 times, most recently from 24e3a58 to c9b37af on February 24, 2025 18:11

ericcurtin commented Feb 24, 2025

I tested this on macOS: with no containers it's fine and works as expected. When using containers it's not fine; the port is inaccessible.

@ericcurtin

macOS is a funny case because we are also dealing with podman-machine networking, as well as the container, as well as the macOS host OS.

rhatdan force-pushed the host branch 2 times, most recently from f689528 to 7dcb7fa on February 27, 2025 14:52
Currently RamaLama is listening on 0.0.0.0, which means it accepts
connections on all network interfaces, including interfaces reachable
from outside the host. Moving to 127.0.0.1 means that the service is
only available on the local system.

This will only affect llama.cpp running without containers.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
@ericcurtin

Tested this again now that it's green:

  • native macOS - works (127.0.0.1:8080 is accessible locally on the machine, as no encapsulation is involved)
  • macOS via podman machine - no connectivity to the server
  • Linux via podman - no connectivity to the server

If we merge this and people start using it, they will complain there is no connectivity to the server. Maybe some of these options fix things, I dunno:

  -p, --publish strings                          Publish a container's port, or a range of ports, to the host (default [])
  -P, --publish-all                              Publish all exposed ports to random ports on the host interface
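For what it's worth, --publish also accepts a host IP, so the port could be published only on the host's loopback while the in-container server binds 0.0.0.0 (a hedged sketch complementing the one earlier in the thread; not something this PR implements):

```python
def publish_spec(port: int, host_ip: str = "127.0.0.1") -> list[str]:
    # "podman run -p 127.0.0.1:8080:8080 ..." exposes the container port on
    # the host's loopback only, instead of on every host interface.
    return ["-p", f"{host_ip}:{port}:{port}"]


print(publish_spec(8080))  # ['-p', '127.0.0.1:8080:8080']
```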


rhatdan commented Feb 27, 2025

With containers, you have to open the firewall rules to get access; the same settings would be needed without this change.


rhatdan commented Feb 27, 2025

Test on macOS or Linux without the patch and the ports on the host will not be exposed, I believe.


ericcurtin commented Feb 27, 2025

Without this PR, all 3 scenarios expose the port


benoitf commented Feb 27, 2025

I think it can't be merged like that, or else the default experience for macOS/Windows users using containers will be painful.

The default should not be changed for at least these use cases:
Windows and macOS using containers
