Integrating LLMs with Services using Harbor

Preamble

As your self-hosted AI stack grows, you may want an LLM toolkit to manage its services. One such tool is Harbor, a toolkit that lets you install, manage, and integrate Docker Compose stacks with simple CLI commands or an optional GUI.

Installing an LLM Stack With Harbor

Install The Necessary Dependencies

Git
Install Git with your distribution’s package manager, as shown below.
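For reference, the standard packages (assuming default repositories):

```bash
sudo dnf install git      # Fedora
sudo apt install git      # Debian
sudo pacman -S git        # Arch
```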
Docker and Docker Compose

Follow Docker’s official install documentation for your distribution at https://docs.docker.com/engine/install/.

NVIDIA Container Toolkit (Optional)

The NVIDIA Container Toolkit is required for CUDA support, i.e. to give containers access to your NVIDIA GPU.

Fedora
  1. Add NVIDIA’s repository
  2. (Optional) Enable experimental packages
  3. Install the NVIDIA Container Toolkit
Debian

  1. Add NVIDIA’s production repository
  2. (Optional) Enable experimental packages
  3. Install the NVIDIA Container Toolkit
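The matching commands from NVIDIA’s install guide (again, verify against the current docs):

```bash
# 1. Add NVIDIA's production repository and its signing key
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# 2. (Optional) Enable experimental packages by uncommenting them
sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list

# 3. Install the toolkit
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
```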

Arch

Install the NVIDIA Container Toolkit
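On Arch, the toolkit is available in the official repositories:

```bash
sudo pacman -S nvidia-container-toolkit
```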

Configure Docker
Restart the Docker Daemon
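Both steps use tooling that ships with the packages above: nvidia-ctk registers the NVIDIA runtime with Docker, and systemd restarts the daemon:

```bash
# Register the NVIDIA runtime in /etc/docker/daemon.json
sudo nvidia-ctk runtime configure --runtime=docker

# Restart the Docker daemon to pick up the change
sudo systemctl restart docker
```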

Add User to Docker Group

  1. Create the docker group
  2. Add your user to the docker group
  3. Log out and back in, then verify that docker runs without root
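The standard sequence from Docker’s post-install documentation:

```bash
# 1. Create the docker group (it may already exist)
sudo groupadd docker

# 2. Add your user to the docker group
sudo usermod -aG docker $USER

# 3. After logging out and back in, verify docker runs without root
docker run hello-world
```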

If docker still asks for root permissions, verify your docker group membership with groups $USER and restart your session or machine.

Install Harbor CLI

You can use the “unsafe” one-liner to install Harbor. Ensure you review the script before piping it into your shell.
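At the time of writing, the one-liner in Harbor’s README looks like this; confirm it against the repository before running it:

```bash
# Review get-harbor.sh before piping it into bash
curl https://av.codes/get-harbor.sh | bash
```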

Optionally, you can install Harbor with a package manager via npm or PyPI, as shown below.
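The package names below are taken from Harbor’s install instructions; double-check them there before installing:

```bash
# npm
npm install -g @avcodes/harbor

# PyPI (via pipx)
pipx install llm-harbor
```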

Install Harbor App (optional)

If you’re running a desktop environment, a GUI app is available from the releases page; however, it is still relatively new and may not behave as expected. On Debian, you can download the latest .deb package and install it with dpkg.
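A sketch with a hypothetical file name; substitute the actual package you downloaded from the releases page:

```bash
# Install the downloaded package (file name varies by release)
sudo dpkg -i ./harbor-app_<version>_amd64.deb
```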

If you’re using a non-Debian-based distro, you can install the AppImage either manually or with a tool like bauh or AppImageLauncher.

To install manually, download the latest version, then change its permissions to make it executable.
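For example, with a hypothetical file name from the releases page:

```bash
chmod +x ~/Downloads/harbor-app_<version>_amd64.AppImage
```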

You can then launch the application
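Using the same hypothetical file name:

```bash
~/Downloads/harbor-app_<version>_amd64.AppImage
```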

Once the GUI has launched, you can detach the app from the terminal as shown below.
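One way to do this with standard shell job control:

```bash
# Press Ctrl+Z to suspend the app, then:
bg      # resume it in the background
disown  # detach it from the shell so it survives closing the terminal
```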

Windows and macOS users can use the .exe/.msi or .dmg packages for their respective operating systems.

Harbor App Overview

Overview of the features of the Harbor App:
1. Filter based on service type
2. Search for a service by name
3. Stop a service
4. Open a running service (broken in the current AppImage)
5. Service type
6. Whether the service is set to start by default with the harbor up command
7. Start a service
8. Open documentation (currently broken in the AppImage; right-click to copy the URL as a workaround)

Running Open-WebUI stack with Harbor CLI

Harbor allows you to pick and choose the services and models you would like to integrate into your AI workflow. You can refer to the Harbor user guide for a full list of commands.

Default (Open-WebUI + Ollama)

For now, I want to focus on running the default stack, with Open-WebUI as the frontend and Ollama as the backend.
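With Harbor installed, the default stack comes up with a single command:

```bash
# Start the default services (Open-WebUI + Ollama)
harbor up

# Check what is running
harbor ps
```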

Backends

I’ll also run a few additional inference backends, as shown below.
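Extra services are simply appended to harbor up. The handles below (llamacpp and vllm) are examples from Harbor’s service list; swap in whichever backends you want:

```bash
# Start the default stack plus additional inference backends
harbor up llamacpp vllm
```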

Frontends

I’ll integrate Flux in Open-WebUI for text-to-image support, with ComfyUI serving the images.

Open ComfyUI in your browser and let it pull the Stable Diffusion models from Hugging Face.
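A minimal sketch of bringing ComfyUI up alongside the default stack:

```bash
# Start ComfyUI together with the default services
harbor up comfyui

# Open the ComfyUI web interface in your browser
harbor open comfyui
```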

Satellite Services

You can integrate additional tools by running select satellite services. As tempting as it may be, I would not advise running everything at once; doing so will add complexity to your setup. Read through the features of each service and pick only the ones you need.
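For example, to add just SearXNG for web search (one of Harbor’s satellite services):

```bash
harbor up searxng
```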

Each of these will require further configuration depending on what you wish to do with them, which is beyond the scope of this post. Refer to their respective documentation for guidance.

Models

DeepSeek has been making waves in the news lately, so it’s time to test it out! I already tested the r1 model in a previous post about Open-WebUI with Docker, so this time I’ll test deepseek-coder, as some of my satellite services are development-specific. You can pull Ollama models using `harbor ollama pull <model name>`:
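Pulling the model for this post:

```bash
# Pull the model through Harbor's bundled Ollama CLI
harbor ollama pull deepseek-coder
```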

You can also pull GGUF models directly from Hugging Face for use with different inference backends:

  1. Download the model
  2. Locate the file
  3. Set the path to the model

Example:

  1. Download the model. For Dolphin3.0-R1-Mistral-24B:
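Harbor exposes the Hugging Face CLI as harbor hf. The GGUF repository and file names below are assumptions (a common community quantization); substitute the quant you actually want:

```bash
# Download a GGUF quantization of Dolphin3.0-R1-Mistral-24B
# (repository and file names are assumed; check Hugging Face)
harbor hf download bartowski/Dolphin3.0-R1-Mistral-24B-GGUF \
  Dolphin3.0-R1-Mistral-24B-Q4_K_M.gguf
```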

Another model I’m interested in is a text-to-video model.

Note: If the model is gated/private, you can set your Hugging Face access token as shown below.
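Since harbor hf proxies the Hugging Face CLI, one option is its login command (the flag below is the HF CLI’s; the token value is your own):

```bash
harbor hf login --token $HF_TOKEN
```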

  2. Locate the model you wish to use, e.g. with find.
  3. Set the path (repeat these steps to switch between models). /app/models/hub/ is the mount point inside the container; your model path will differ, so check the output of your find command.
  4. Restart llamacpp for the change to take effect.
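A sketch of steps 2 to 4 on the host. The cache location and the llamacpp.model.specifier config key are assumptions; check harbor config ls for the exact key in your Harbor version:

```bash
# 2. Locate the downloaded GGUF file in the Hugging Face cache
#    (default cache location assumed)
find ~/.cache/huggingface/hub -name "*.gguf"

# 3. Point llamacpp at the file. /app/models/hub/ is the mount point
#    inside the container; replace the rest of the path with your
#    find output (the <hash> segment comes from the snapshot directory).
harbor config set llamacpp.model.specifier \
  "/app/models/hub/models--bartowski--Dolphin3.0-R1-Mistral-24B-GGUF/snapshots/<hash>/Dolphin3.0-R1-Mistral-24B-Q4_K_M.gguf"

# 4. Restart llamacpp for the change to take effect
harbor restart llamacpp
```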

Updating Docker Containers

Update default stack

Update specific service

Once the images are pulled, restart for the changes to take effect.
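A sketch using Harbor’s wrappers around Docker Compose (webui is just an example service handle):

```bash
# Update images for the default stack
harbor pull

# Update a specific service, e.g. Open-WebUI
harbor pull webui

# Restart so containers are recreated from the new images
harbor restart
```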

Opening Harbor Services

You can open Harbor services in your browser using the `harbor open <service>` command:
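For example, to open the Open-WebUI frontend:

```bash
harbor open webui
```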

Next Steps

From here, explore to your heart’s content! Check Harbor’s GitHub page periodically for new features.
