RX 6800/6900 — QLoRA fine-tuning of 7B models works. Full fine-tuning requires gradient checkpointing. The flash-attention fallback path is slower than on RDNA3.
Install hints
- Same install as gfx1100.
- Add gradient_checkpointing: true and micro_batch_size: 1 to your config.
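The two memory-saving options above slot into the training config like this — a minimal sketch assuming an axolotl-style YAML file; the base_model, adapter, and accumulation values are illustrative placeholders, not recommendations:

```yaml
# Memory-saving settings for RDNA2 cards; base_model is a placeholder.
base_model: meta-llama/Llama-2-7b-hf
load_in_4bit: true              # QLoRA: 4-bit quantized base model
adapter: qlora
gradient_checkpointing: true    # required for full fine-tuning on these cards
micro_batch_size: 1             # keep per-step memory low
gradient_accumulation_steps: 8  # recover effective batch size
```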
SDXL is slow; SD 1.5 works fine. Expect memory pressure with larger models.
ENV vars
export HSA_OVERRIDE_GFX_VERSION=10.3.0
Install hints
- Same as gfx1100 install; expect ~3-4x slower SDXL generation.
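The override must be visible to every process that touches ROCm, so it is worth a sanity check before launching anything (plain POSIX shell):

```shell
# 10.3.0 corresponds to the gfx1030 target (RX 6800/6900); the override
# tells ROCm user-space libraries to load kernels built for that ISA.
export HSA_OVERRIDE_GFX_VERSION=10.3.0

# If this prints nothing, the export did not happen in the shell that
# launches Python, and the override silently does nothing.
echo "$HSA_OVERRIDE_GFX_VERSION"
```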
RX 6800/6900 — functional, but kernel tuning may differ from RDNA3. Performance is good for quantized models; throughput is roughly 60-70% of gfx1100.
Install hints
- pip install torch --index-url https://download.pytorch.org/whl/rocm6.2
- pip install exllamav2
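A quick back-of-envelope for why quantized models are the sweet spot on a 16 GB card: weight memory scales linearly with bits per weight. A stdlib-only sketch — the 7B parameter count and bit-widths are illustrative assumptions, not values read from exllamav2:

```python
def weight_gib(params: float, bits_per_weight: float) -> float:
    """Approximate GiB needed just for the weights (no KV cache, no activations)."""
    return params * bits_per_weight / 8 / 2**30

params_7b = 7e9
for label, bpw in [("fp16", 16.0), ("8-bit", 8.0), ("4.5 bpw EXL2", 4.5)]:
    print(f"{label}: {weight_gib(params_7b, bpw):.1f} GiB")
```

An fp16 7B model alone nearly fills a 16 GB card before any KV cache is allocated, while a ~4.5 bpw quant leaves ample headroom.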
PyTorch ROCm wheels are available, but transcription is slower than on gfx1100.
Install hints
- pip install torch torchaudio --index-url https://download.pytorch.org/whl/rocm6.2
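On ROCm builds of PyTorch, the GPU is surfaced under the cuda device name, so the stock whisper CLI can be pointed at it directly. A small stdlib helper that assembles the command — the helper itself is hypothetical, but --model and --device are real whisper CLI flags:

```python
def build_whisper_cmd(audio_path: str, model: str = "small") -> list[str]:
    # "cuda" is correct on ROCm: HIP devices are exposed through the
    # cuda device string, not a separate "rocm" device.
    return ["whisper", audio_path, "--model", model, "--device", "cuda"]

cmd = build_whisper_cmd("meeting.wav")
```

Pass `cmd` to `subprocess.run` once whisper is installed, or just run the equivalent command line by hand.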
RX 6800/6900 — works out of the box with the HIP build. Performance is slightly below gfx1100. The Vulkan build is an alternative if ROCm gives trouble.
Install hints
- Same HIP build as gfx1100.
- Vulkan alternative: cmake -B build -DGGML_VULKAN=ON && cmake --build build --config Release -j$(nproc)
Works on RX 6800/6900 with ROCm 6.x. Some older guides recommend setting HSA_OVERRIDE_GFX_VERSION, but this is no longer needed.
Install hints
- curl -fsSL https://ollama.com/install.sh | sh
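Once the service is running, Ollama answers on a local HTTP API (default port 11434); the body for /api/generate is plain JSON. A stdlib sketch that only builds the payload — the model tag is a placeholder for whatever you have pulled:

```python
import json

def generate_payload(model: str, prompt: str) -> str:
    # stream=False requests a single JSON response instead of a stream
    # of chunks; model must match a tag pulled with `ollama pull`.
    return json.dumps({"model": model, "prompt": prompt, "stream": False})

body = generate_payload("llama3.1", "Why is the sky blue?")
```

POST this body to http://localhost:11434/api/generate with curl or urllib to get a completion back.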
RX 6800/6900 — works with HSA_OVERRIDE_GFX_VERSION set. SDXL is slow; SD 1.5 is fine. Use --medvram for stability.
ENV vars
export HSA_OVERRIDE_GFX_VERSION=10.3.0
Install hints
- Same install as gfx1100.
- Launch with: ./webui.sh --medvram
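Why SDXL hurts where SD 1.5 does not: the UNet runs on latents at 1/8 of image resolution, so SDXL's native 1024x1024 means 4x the latent elements of SD 1.5's 512x512. A stdlib sketch — the 8x spatial downsampling and 4 latent channels are the standard Stable Diffusion VAE layout:

```python
def latent_elements(width: int, height: int, channels: int = 4, factor: int = 8) -> int:
    # The VAE downsamples each spatial dimension by `factor`.
    return (width // factor) * (height // factor) * channels

sd15 = latent_elements(512, 512)    # 64 x 64 x 4
sdxl = latent_elements(1024, 1024)  # 128 x 128 x 4
ratio = sdxl // sd15
```

Every UNet step touches 4x the data (and SDXL's UNet is itself much larger), which is why --medvram helps here.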
RX 6800/6900 — works but slower than RDNA3 cards. PagedAttention works. Flash-attention backend may fall back to a slower kernel.
Install hints
- pip install vllm --extra-index-url https://download.pytorch.org/whl/rocm6.2
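PagedAttention's benefit is easiest to see in KV-cache arithmetic: the cache is allocated in fixed-size token blocks rather than one max-context slab per request. A sketch with Llama-7B-like shapes — the 32 layers, 32 KV heads, head dim 128, and fp16 are assumed model parameters, not values queried from vLLM:

```python
def kv_bytes_per_token(layers=32, kv_heads=32, head_dim=128, dtype_bytes=2):
    # 2x accounts for the separate K and V tensors at every layer.
    return 2 * layers * kv_heads * head_dim * dtype_bytes

per_token = kv_bytes_per_token()  # 0.5 MiB of cache per generated token
block = 16 * per_token            # vLLM's default block_size is 16 tokens
```

A request that stops at 200 tokens therefore pins only ~100 MiB of KV cache, instead of a context-length-sized allocation.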