Improve the containerized deployment story #39
Replies: 4 comments 7 replies
Thanks for the feedback, I've moved this to a discussion since it's a better fit for general feedback. Issues need to be clear and focused, describing a single, specific, actionable bug or feature request with enough detail to reproduce or implement. This post covers several different topics (entrypoint customization, logging, config management, authentication, Ollama compatibility), each of which would ideally be its own issue with sufficient context if you'd like to see specific changes made.

The Docker docs are at https://llmspy.org/docs/deployment/docker, which include examples of how to run it, e.g.:

`docker run -p 8000:8000 -e OPENROUTER_API_KEY="your-key" ghcr.io/servicestack/llms:latest`

To help manage multiple environment variables and other configuration, using Docker Compose is recommended. You can also use environment variables for Verbose ( It's not clear how you're attempting to run custom commands like
Not clear about this issue; it should be reading your llms.json from your volume. If it doesn't exist it has to create it, but you should be able to use your own custom one if it exists. You can start with the default llms.json from the repo and customize it to suit, where you should be able to statically define only the models you want available with map_models; otherwise it will attempt to load the models at runtime from your Ollama endpoint. Not sure what's causing your Ollama error; try enabling Verbose and Debug Logging to see if it can provide more context about the error.

**Authentication:** I plan on adding a simple credentials auth system after the major features I'm currently working on (Themes / Custom Agents). Feel free to watch that issue to get notified when it's available.

**Complexity:** Unfortunately, general feedback like this isn't actionable on its own - it doesn't say what specifically was confusing or what documentation was missing. If you ran into a specific step where the docs were wrong, unclear, or absent, that's something that can be looked at. Every project could be "easier", but without knowing the specific gaps, there's nothing concrete to improve. As this is a new project I don't expect any of it to be in LLM training sets, so I try to publish extensive documentation at https://llmspy.org/docs/ that should hopefully cover most use cases.
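The Docker Compose suggestion above could look something like this sketch. It reuses the image and `OPENROUTER_API_KEY` variable from the `docker run` example in the docs; the `VERBOSE=1` variable is mentioned later in this thread, but the container-side volume path is an assumption:

```yaml
# Hypothetical compose sketch; the /data mount path inside the
# container is an assumption, not taken from the official docs.
services:
  llms:
    image: ghcr.io/servicestack/llms:latest
    ports:
      - "8000:8000"
    environment:
      OPENROUTER_API_KEY: "your-key"
      VERBOSE: "1"           # verbose logging, as mentioned below
    volumes:
      - llms-data:/data      # assumed data path

volumes:
  llms-data:
```

Keeping the variables in one compose file avoids repeating a long `docker run` invocation every time.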
The container accepts commands. If no command is given, it runs as a server. But if I give it a command after the image name, say
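For context, this command-after-the-image behavior is standard Docker/OCI semantics: the command replaces the image's `CMD` and is passed as arguments to its `ENTRYPOINT`, so whether a custom `llms` invocation still gets the volume initialized depends entirely on how the image wires those two together. A hypothetical sketch (not the actual llms image definition):

```dockerfile
# Hypothetical ENTRYPOINT/CMD split; script name and default command
# are assumptions for illustration only.
ENTRYPOINT ["/entrypoint.sh"]    # always runs, e.g. to initialize the volume
CMD ["--serve", "8000"]          # default args; replaced by: docker run IMAGE <args>
```

With this layout, `docker run IMAGE --serve 12345 --verbose` would still pass through the entrypoint's volume initialization before the server starts.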
This wouldn't do. The example is large, and I would have to keep updating my starting llms.json as the project evolves - that's harder than manual edits. Secondly, this would be a one-time initialization persisted in the volume, so I would not be able to change the provider list just by rebuilding my container. That's not good for software-defined environments. Most server software reads immutable config files that can be baked into a container image or a VM by configuration/build scripts, which makes for straightforward setup that LLMs can handle well.
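The merge behavior being asked for here could look something like the following sketch. To be clear, llms.py has no such mechanism today, and the `llms.override.json` file name is made up: the idea is that a small, immutable override file baked into the image is deep-merged over the generated llms.json at startup, so only the deltas need maintaining.

```python
import json
from pathlib import Path

def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

def load_config(volume_dir: str) -> dict:
    """Read the generated llms.json, then layer an optional baked-in
    override file (hypothetical name: llms.override.json) on top."""
    config = json.loads((Path(volume_dir) / "llms.json").read_text())
    override_path = Path(volume_dir) / "llms.override.json"
    if override_path.exists():
        config = deep_merge(config, json.loads(override_path.read_text()))
    return config
```

Because the override is read on every startup, rebuilding the image with a new override file would change the effective provider list without touching the volume.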
I had to adapt this a bit for my environment (podman, systemd quadlets, scripted setup), but I mostly followed it (missed the `VERBOSE=1` though, thanks for the tip). One thing I noticed is that tricks like
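For anyone else on quadlets, a unit for this kind of setup might look roughly like the following sketch. The volume name, port mapping, and container-side path are illustrative assumptions; `VERBOSE=1` and the `journald` log driver come from earlier in this thread:

```ini
# ~/.config/containers/systemd/llms.container (illustrative sketch)
[Container]
Image=ghcr.io/servicestack/llms:latest
PublishPort=8000:8000
Environment=VERBOSE=1
Volume=llms-data:/data
LogDriver=journald

[Service]
Restart=always

[Install]
WantedBy=default.target
```

After a `systemctl --user daemon-reload`, podman generates and manages the service unit from this file.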
I tried to set up llms.py in a podman container and connect it to my local Ollama instances, but I encountered several issues and the test drive ultimately failed despite a lot of effort. Let me describe the main pain points.
- **Entrypoint script:** There's a prebuilt container image (`ghcr.io/servicestack/llms:latest`), which is good, and it asks for a single volume to save data, which is again good. The trouble is that the volume requires initialization. There seems to be some entrypoint script that handles volume initialization, but that means I cannot customize llms parameters like port and verbosity. At least my attempts to run the container with something like `llms --serve 12345 --verbose` have failed. Things only started working once I relied on the built-in entrypoint script for volume initialization and generally proper server setup. I let llms.py run on its default port and expose a different port via podman configuration. I have no idea how to enable verbose logging now.
- **Logging:** Not sure how llms.py writes logs, but I had to set the log driver in podman to `journald` to see any logs, because nothing was logged under default settings. Even then the error log is no more detailed than the brief error I got in the UI. As explained above, I wasn't able to turn on verbose logging.
- **Baked-in configuration:** There's no way to bake my custom Ollama endpoints into the container image or to otherwise supply them to llms.py automatically. I am expected to manually edit llms.json after the volume is initialized. To automate that, I would have to (1) run the service once and wait for the volume to be initialized, (2) read llms.json from the volume, (3) have a Python script patch in my Ollama endpoints, and (4) write the modified llms.json into the volume while the service is temporarily stopped. That's way too complicated. Why can't we just drop a file with predefined configuration somewhere that llms.py would merge into its dynamic configuration? I have also found no way to disable tools by default in the config file.
- **Failing Ollama requests:** So I have added one Ollama endpoint manually. Its models show up in the UI, but when I try to start a chat, I get a `[Errno None] Can not write request body for http://localhost:11436/v1/chat/completions` error. The logs don't say anything more. It might be because it's a year-old Intel IPEX fork of Ollama, which is based on an even older upstream. I will try to upgrade someday and try again. But the OpenAI endpoint has been part of Ollama for ages; it should work even if the Ollama version is old.
- **Authentication:** I gather there's some GitHub auth extension, but that's overkill and maybe a security problem of its own in a local setup. Why not a simple username/password? Without authentication, I wonder how much access to llms.py is granted to random websites I visit by llms.py's CORS configuration. Flatpak apps have full access to localhost ports too. Fronting by a reverse proxy is not an option on localhost.
- **Complexity:** I have spent several hours exploring numerous blind alleys and asked ChatGPT some 20 different questions about various aspects of setting up llms.py. ChatGPT often had to dig into the source code for answers, probably due to insufficient documentation. This ought to be easier.
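For reference, step (3) of the baked-in-configuration workaround amounts to a small patch script along these lines. The shape of llms.json shown here (a top-level `providers` map with `type`/`base_url` entries) is a guess for illustration, not the real schema:

```python
import json
from pathlib import Path

def add_ollama_endpoint(config_path: str, name: str, base_url: str) -> None:
    """Patch an Ollama endpoint into an already-initialized llms.json.
    The 'providers' key and entry shape are assumptions about the schema."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    providers = config.setdefault("providers", {})
    providers[name] = {"type": "ollama", "base_url": base_url}
    path.write_text(json.dumps(config, indent=2))
```

Even then the service has to be stopped while the file is rewritten, which is exactly the awkwardness the merge-file suggestion above would avoid.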
Anyways, I like what you are doing here and I hope llms.py will keep improving. For now, I just want to leave feedback from my test drive here. Feel free to discard any part that does not align with your goals.