Default ramalama serve to only listen on localhost #876
base: main
Conversation
Reviewer's Guide by Sourcery

This pull request modifies the default host address that `ramalama serve` listens on.

Sequence diagram for ramalama serve interaction:

    sequenceDiagram
        participant User
        participant CLI
        participant Container
        User->>CLI: ramalama serve
        CLI->>CLI: Reads config (host = 127.0.0.0)
        CLI->>Container: Starts container with llama.cpp listening on 127.0.0.0
        Container-->>CLI: Container running
        CLI-->>User: Service running on localhost
Hey @rhatdan - I've reviewed your changes - here's some feedback:
Overall Comments:
- It might be worth mentioning the security implications of binding to 0.0.0.0 vs 127.0.0.1 in the commit message.
Here's what I looked at during the review
- 🟡 General issues: 3 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
docs/ramalama.conf (Outdated)

    @@ -42,7 +42,7 @@

     # IP address for llama.cpp to listen on.
     #
    -#host = "0.0.0.0"
    +#host = "127.0.0.0"
question (bug_risk): Clarify the loopback IP change from 0.0.0.0 to 127.0.0.0.
Typically, '127.0.0.1' is used as the loopback address. Using '127.0.0.0' might unintentionally cause binding issues. Please confirm if this change is intentional.
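A quick way to see the concern, as a hedged illustration rather than anything from this PR: probe whether the OS will even bind a socket to each candidate address. On Linux the whole 127.0.0.0/8 block is routed to the loopback interface, so 127.0.0.0 may bind there, while macOS typically only configures 127.0.0.1 on loopback, which matches the bind failure reported later in this thread.

```python
# Hedged illustration (not part of the PR): check whether the OS accepts a
# bind to each candidate address.  Results are platform-dependent.
import socket

def can_bind(host: str, port: int = 0) -> bool:
    """Return True if a TCP socket can be bound to (host, port)."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind((host, port))  # port 0 -> any free ephemeral port
        return True
    except OSError:
        return False

for addr in ("127.0.0.1", "127.0.0.0", "0.0.0.0"):
    print(addr, can_bind(addr))
```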
ramalama/cli.py (Outdated)

    @@ -856,7 +856,7 @@ def serve_parser(subparsers):
         run_serve_perplexity_args(parser)
         add_network_argument(parser, "")
         parser.add_argument("-d", "--detach", action="store_true", dest="detach", help="run the container in detached mode")
    -    parser.add_argument("--host", default=config.get('host', "0.0.0.0"), help="IP address to listen")
    +    parser.add_argument("--host", default=config.get('host', "127.0.0.0"), help="IP address to listen")
question (bug_risk): Review the host IP default in CLI arguments.
The default is now set to '127.0.0.0', which is uncommon for a loopback address. Confirm if this specific IP should be used or if '127.0.0.1' is more appropriate.
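For context, a minimal sketch (not the actual ramalama code) of how such a config-backed default resolves, assuming `config` behaves like a dict loaded from ramalama.conf and using the 127.0.0.1 value the reviewers suggest:

```python
# Minimal sketch of a config-backed CLI default; `config` here is a plain dict
# standing in for values parsed from ramalama.conf.
import argparse

config = {}  # empty means "no host set in ramalama.conf", so fall back to the built-in default

parser = argparse.ArgumentParser(prog="ramalama serve")
parser.add_argument(
    "--host",
    default=config.get("host", "127.0.0.1"),  # reviewers suggest 127.0.0.1, not 127.0.0.0
    help="IP address to listen",
)

args = parser.parse_args([])
print(args.host)  # -> 127.0.0.1 when neither the config file nor the CLI overrides it
```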
docs/ramalama.conf.5.md (Outdated)

    @@ -78,7 +78,7 @@ Run RamaLama using the specified container engine.
     Valid options are: Podman and Docker
     This field can be overridden by the RAMALAMA_CONTAINER_ENGINE environment variable.

    -**host**="0.0.0.0"
    +**host**="127.0.0.0"
issue (bug_risk): Potential typo in IP address

The IP address 127.0.0.0 appears to be a typo. If the intention is to bind to the loopback interface, it should be 127.0.0.1. If all interfaces are intended, then 0.0.0.0 should be used. Given the previous value was 0.0.0.0, this is likely what was intended.
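To make the distinction concrete, here is a small standard-library illustration (not part of the PR) of why 127.0.0.0 reads as a typo: it is the network address of the loopback block, 127.0.0.1 is the conventional loopback host, and 0.0.0.0 is the unspecified "all interfaces" address.

```python
# Illustration of the three addresses discussed in this review.
import ipaddress

loopback_net = ipaddress.ip_network("127.0.0.0/8")

# 127.0.0.0 is the network address of the loopback block, not a usual host address.
print(ipaddress.ip_address("127.0.0.0") == loopback_net.network_address)  # True

# 127.0.0.1 is the conventional loopback host.
print(ipaddress.ip_address("127.0.0.1").is_loopback)                      # True

# 0.0.0.0 is the unspecified address, i.e. "listen on all interfaces" when binding.
print(ipaddress.ip_address("0.0.0.0").is_unspecified)                     # True
```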
Good catch here
Thank you for your feedback, we will generate more comments like this in the future.
    ./bin/ramalama serve --port 9000 tinyllama
    build: 4763 (f777a73e) with Apple clang version 16.0.0 (clang-1600.0.26.6) for arm64-apple-darwin24.2.0
    system info: n_threads = 8, n_threads_batch = 8, total_threads = 12
    system_info: n_threads = 8 (n_threads_batch = 8) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
    main: couldn't bind HTTP server socket, hostname: 127.0.0.0, port: 9000
on macOS it needs to be 127.0.0.1
I think this is a good idea, but just confirming: will we still be able to access the server outside the container when we run "ramalama serve"? I think that's why we set it to 0.0.0.0 at least initially. I think this is worth a quick manual test to check. I wouldn't be surprised if this config has multiple contexts too, like being passed to podman, llama.cpp and/or vllm separately, can't remember.
On all platforms it needs to be 127.0.0.1, I think it's just a typo :)
@ericcurtin to your question: no, it doesn't work with containers, so I can't reach the port, as it's not exposed by the process inside the container.
I think we want that server somewhat accessible externally. Although 0.0.0.0 does seem over the top and insecure, 127.0.0.1 and only accessible inside the container also seems over the top, too secure. I think what the user would expect as a default is that it behaves sort of as if we weren't containerized at all, like if we ran llama-server on 127.0.0.1 with no containers.
And yeah, it's important to test this change in particular with and without containers, because what's considered a local network is kinda different in both cases.
On macOS/Windows, there is a process on the host listening on 127.0.0.1 that redirects the traffic to the container, but the process inside the container needs to bind on 0.0.0.0, so maybe a special case when running inside containers? Because right now, with this PR, it's failing on macOS/Windows when using containers and the default command.
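One way to get that special case, sketched here only as an assumption about how it could look (this is not what the PR implements; the image name and server command are illustrative): publish the container port only on the host's loopback address using podman's ip:hostPort:containerPort syntax, while the server inside the container keeps binding 0.0.0.0.

```python
# Hedged sketch: restrict host-side exposure to loopback while the in-container
# server still binds 0.0.0.0.  Image name and server command are illustrative.
import subprocess

host_port = 8080
container_port = 8080
image = "quay.io/ramalama/ramalama"  # hypothetical image reference

cmd = [
    "podman", "run", "--rm",
    # Publish only on the host's loopback address; podman forwards traffic
    # into the container, where the server listens on 0.0.0.0.
    "-p", f"127.0.0.1:{host_port}:{container_port}",
    image,
    "llama-server", "--host", "0.0.0.0", "--port", str(container_port),
]
print(" ".join(cmd))              # inspect the command that would be run
# subprocess.run(cmd, check=True)  # uncomment to actually start the container
```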
So --host is only used with --nocontainer then? They should conflict if set.
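If those flags should conflict, here is a hedged argparse sketch of one way to enforce it; the flag names mirror the ones discussed here, but the check itself is hypothetical, not existing ramalama behavior.

```python
# Hypothetical sketch: reject --host unless --nocontainer is also given.
import argparse

DEFAULT_HOST = "127.0.0.1"

parser = argparse.ArgumentParser(prog="ramalama serve")
parser.add_argument("--host", default=DEFAULT_HOST, help="IP address to listen")
parser.add_argument("--nocontainer", action="store_true",
                    help="run llama.cpp directly on the host")

args = parser.parse_args(["--host", "0.0.0.0"])  # no --nocontainer -> conflict
if args.host != DEFAULT_HOST and not args.nocontainer:
    parser.error("--host is only supported together with --nocontainer")
```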
Force-pushed from 24e3a58 to c9b37af (compare)
I tested this on macOS: with no containers it's fine, works as expected. When using containers it's not fine, the port is inaccessible.
macOS is a funny case because we are also dealing with podman-machine networking, as well as the container, as well as the macOS host OS.
Force-pushed from f689528 to 7dcb7fa (compare)
Currently RamaLama is listening on 0.0.0.0, which means that it listens on all network interfaces, including interfaces that are exposed outside of the host. Moving to 127.0.0.1 means that the service is only available on the local system. This will only affect llama.cpp running without containers.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Tested this again now that it's green:

- via native macOS - works (127.0.0.1:8080 is accessible on the machine in the local network, as no encapsulation is involved)

If we merge this and people start using it, they will complain there is no connectivity to the server; maybe some of these options fix things, I dunno.
With containers, you have to open the firewall rules to access it; the same settings without this change would work.
Test on the Mac or Linux without the patch and the ports on the host will not be exposed, I believe.
Without this PR, all 3 scenarios expose the port.
I think it can't be merged like that, else the default experience for macOS/Windows users using containers will be painful. The default should not be changed, at least for these use cases.
Summary by Sourcery

Documentation:

- Update the documented host default in docs/ramalama.conf and docs/ramalama.conf.5.md to match the new localhost default.