How to Run Local LLMs with Ollama on AMD GPU (Complete Guide)

Mar 10, 2025    #homelab   #ollama   #llm   #amd   #rocm   #gpu   #local-llm  

This guide demonstrates how to run Large Language Models (LLMs) locally using Ollama with AMD GPU acceleration. While many guides focus on NVIDIA GPUs, this one specifically covers AMD GPU setup using ROCm on Arch Linux. Running LLMs locally provides better privacy, reduced latency, and no API costs.

This is mainly being put here as a reference for myself. I will write up Nix instructions when I finally migrate my main system to Nix, but at the moment this machine is on Arch, so I wanted to document the process.

I am using an AMD GPU, so this may differ for you:

Install AMD GPU backend packages:

sudo pacman -S rocminfo rocm-opencl-sdk rocm-hip-sdk rocm-ml-sdk
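
Optionally, before going any further, check that ROCm actually sees the card. This check is not in my original notes, and the gfx name it prints (e.g. gfx1030) depends on your GPU; if it prints nothing, make sure your user is in the render and video groups.

# GPU agents show up with a gfx* target name, e.g. gfx1030
rocminfo | grep -i gfx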

Install Ollama with AMD GPU support:

yay -S ollama-rocm

Start Ollama:

ollama serve
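
One AMD-specific gotcha that is not in my original notes: if ROCm does not officially support your card, Ollama can silently fall back to the CPU. A common workaround is to override the reported gfx version to the nearest supported target when starting the server; 10.3.0 below is only an example value (it matches gfx1030-class RDNA2 cards), so pick the one that fits your GPU.

# Example only - adjust the version to your GPU family
HSA_OVERRIDE_GFX_VERSION=10.3.0 ollama serve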

Run Open WebUI via Docker Compose:

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:${WEBUI_DOCKER_TAG-main}
    container_name: open-webui
    volumes:
      - ./open-webui:/app/backend/data
    network_mode: host
    environment:
      - 'OLLAMA_BASE_URL=http://127.0.0.1:11434'
    restart: unless-stopped

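Assuming the snippet above is saved as docker-compose.yml, start the container in the background with:

docker compose up -d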

Access Open WebUI:

http://localhost:8080
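
Before pulling anything, you can confirm that the Ollama API is reachable on the base URL configured above:

curl http://127.0.0.1:11434/api/version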

Pull a model down:
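
Using the same model-name placeholder as in the commands below (substitute any model from the Ollama library):

ollama pull model-name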

Common Model Commands

# List all installed models
ollama list

# Remove a model
ollama rm model-name

# Get model information
ollama show model-name

# Run a model in CLI
ollama run model-name
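
Two more standard Ollama CLI commands that are useful here, though they were not in my original list: ollama ps shows which models are currently loaded and whether they run on the GPU, and (in recent Ollama versions) ollama stop unloads one.

# Show loaded models and whether they run on CPU or GPU
ollama ps

# Unload a running model
ollama stop model-name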
