Preamble
As your self-hosted AI stack grows, you might want an LLM toolkit to make managing services easier. One such tool is Harbor, a toolkit that lets you install, manage, and integrate Docker Compose stacks with simple CLI commands or an optional GUI.
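To give you a taste before we begin, day-to-day use boils down to a handful of commands, all covered later in this post:
# start the default stack
harbor up
# open a service in your browser
harbor open webui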
Installing an LLM Stack With Harbor
Install The Necessary Dependencies
Git
Fedora
sudo dnf install git
Debian
sudo apt install git
Arch
sudo pacman -S git
Docker and Docker Compose
Follow Docker’s official install documentation for your distribution here
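If you would rather not follow the per-distribution steps, Docker also publishes a convenience script. As with any curl-to-shell install, review the script before running it:
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh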
NVIDIA Container Toolkit (Optional)
The NVIDIA Container Toolkit is required for CUDA (GPU) support inside containers.
Fedora
- Add NVIDIA’s Repository
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
- (optional) Enable Experimental Packages
sudo yum-config-manager --enable nvidia-container-toolkit-experimental
- Install the NVIDIA Container Toolkit
sudo yum install -y nvidia-container-toolkit
Debian
- Add NVIDIA’s production repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
- (optional) Enable experimental packages
sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
- Install the NVIDIA Container Toolkit
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
Arch
Install the NVIDIA Container Toolkit
sudo pacman -S nvidia-container-toolkit
Configure Docker
sudo nvidia-ctk runtime configure --runtime=docker
Restart the Docker Daemon
sudo systemctl restart docker
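You can verify that the toolkit is wired up correctly by running a CUDA base image with GPU access. The image tag below is just an example; any recent nvidia/cuda tag will do:
sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi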
Add User to Docker Group
- Create the docker group
sudo groupadd docker
- Add user to the docker group
sudo usermod -aG docker $USER
- Log out and back in, then verify that Docker runs without root
docker run hello-world
If Docker still asks for root permissions, verify your docker group membership with
groups $USER
then reboot and try again.
Install Harbor CLI
You can use the convenient but unsafe one-liner below to install Harbor. Since it pipes a remote script straight to bash, review the script before running it.
curl https://av.codes/get-harbor.sh | bash
Alternatively, you can install Harbor with a package manager
npm
npm install -g @avcodes/harbor
PyPI
pipx install llm-harbor
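Whichever method you choose, confirm the CLI is on your PATH before moving on (assuming the usual --version convention; harbor --help also works):
harbor --version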
Install Harbor App (optional)
If you’re running a desktop environment, a GUI app is available from the releases page; however, it is still relatively new and may not behave as expected. If you are on a Debian-based distro, download the latest .deb package and install it with dpkg
sudo dpkg -i /path/to/Harbor_0.2.25_amd64.deb
If you’re using a non-Debian-based distro, you can install the AppImage either manually or with a tool like bauh or AppImageLauncher.
To install manually, download the latest version, then make it executable. Root is not needed here since the file belongs to your user.
chmod u+x /path/to/Harbor_0.2.25_amd64.AppImage
You can then launch the application
/path/to/Harbor_0.2.25_amd64.AppImage &
Once the GUI has launched, you can free the app from the terminal with
disown
Windows and macOS users can use the .exe/.msi or .dmg packages for their respective operating systems.
Harbor App Overview
[Annotated screenshot of the Harbor App main window; the numbered items below refer to its labeled UI elements.]
2. Search for service name
3. Stop a service
4. Open a running service (broken in current AppImage)
5. Service type
6. Whether service is set to start by default with harbor up command
7. Start service
8. Open documentation – currently broken in the AppImage; right-click to copy the URL as a workaround
Running Open-WebUI stack with Harbor CLI
Harbor allows you to pick and choose the services and models you would like to integrate into your AI workflow. You can refer to the Harbor user guide for a full list of commands.
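Before starting anything, it helps to see what is available. If I’m reading the user guide right, you can list services like so (double-check the exact flags with harbor --help):
# list default/active services
harbor ls
# list every service Harbor knows about
harbor ls -a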
Default (Open-WebUI + Ollama)
For now, I want to focus on running the default stack: Open-WebUI as the frontend with Ollama as the backend.
harbor up
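Under the hood this is plain Docker Compose, so the usual status checks apply. Assuming Harbor passes these subcommands through to Compose, a quick sanity check looks like:
# show running services
harbor ps
# tail the Open-WebUI logs
harbor logs webui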
Backends
I’ll also run a few additional inference backends:
harbor up llamacpp vllm sglang ktransformers
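If you find yourself starting the same backends every time, Harbor can remember them as defaults. The syntax below is my reading of the user guide, so verify it against the docs:
# make vllm start automatically with a plain `harbor up`
harbor defaults add vllm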
Frontends
I’ll integrate Flux into Open-WebUI for text-to-image support:
harbor up comfyui
Open ComfyUI in your browser (harbor open comfyui, covered below) and let it pull the Stable Diffusion models from Hugging Face.
Satellite Services
You can integrate additional tools by running select satellite services. As tempting as it may be, I would not advise running everything at once; doing so will add complexity to your setup. Read through the features of each service and pick only the ones you need.
harbor up tts langflow sqlchat searxng llamacpp comfyui bolt
Each of these will require further configuration depending on what you wish to do with them, which is beyond the scope of this post. Refer to their respective documentation for guidance.
Models
DeepSeek has been making waves in the news lately, so it’s time to test it out! I already tested the R1 model in a previous post about Open-WebUI with Docker, so this time I’ll test deepseek-coder, since some of my satellite services are development-specific. You can pull Ollama models using Harbor with “harbor ollama pull <model name>”.
harbor ollama pull deepseek-coder
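Since harbor ollama appears to proxy the regular Ollama CLI, you can confirm the pull with the standard list subcommand:
harbor ollama list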
You can also pull GGUF models directly from Hugging Face for use with different inference backends
- Download
harbor hf download <user/repo> <file.gguf>
- Locate the file
harbor find file.gguf
- Set the path to the model
harbor config set llamacpp.model.specifier -m /app/models/<path/to/file.gguf>
Example:
- Download the model (Dolphin3.0-R1-Mistral-24B)
harbor hf download bartowski/cognitivecomputations_Dolphin3.0-R1-Mistral-24B-GGUF cognitivecomputations_Dolphin3.0-R1-Mistral-24B-Q4_K_M.gguf
Another model I’m interested in is a text-to-video model
harbor hf download pollockjj/ltx-video-2b-v0.9.1-gguf ltx-video-2b-v0.9.1-F16.gguf
Note: If the model is gated or private, you can set your Hugging Face access token with
harbor hf token <your HF token>
- Locate the model you wish to use. E.g.
harbor find ltx-video-2b-v0.9.1-F16.gguf
harbor find cognitivecomputations_Dolphin3.0-R1-Mistral-24B-Q4_K_M.gguf
- Set the path (repeat these steps to switch between models). /app/models/hub/ is the mount point inside the container; your model path will differ, so check the output of your find command.
# switch to ltx-video
harbor llamacpp gguf /app/models/hub/models--pollockjj--ltx-video-2b-v0.9.1-gguf/snapshots/e5a475abf7bc740c6952ad5c7babee947de0907/ltx-video-2b-v0.9.1-F16.gguf
# switch to Dolphin-R1-Mistral
harbor llamacpp gguf /app/models/hub/models--bartowski--cognitivecomputations_Dolphin3.0-R1-Mistral-24B-GGUF/snapshots/6379a7231ae315b324f7f78cbe12a4ecb2794e49/cognitivecomputations_Dolphin3.0-R1-Mistral-24B-Q4_K_M.gguf
- Restart llamacpp for the change to take effect
harbor restart llamacpp
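To confirm the new model actually loaded, tail the backend’s logs and look for your model path in the startup output (assuming harbor logs passes through to docker compose logs):
harbor logs llamacpp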
Updating Docker Containers
Update the default stack
harbor pull
Update a specific service
harbor pull <service name>
Once the images are pulled, restart for the changes to take effect
harbor restart
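Both commands accept a service name, so you can update and bounce a single service in one line:
harbor pull webui && harbor restart webui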
Opening Harbor Services
You can open Harbor services in your browser using the “harbor open <service>” command
harbor open webui
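If you only need the address, for example to open a service from another device on your network, the user guide also describes a url command; hedging here since I have not tested it:
harbor url webui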
Next Steps
From here, explore to your heart’s content! Check Harbor’s GitHub page periodically for new features.