RX 6800/6900 — QLoRA fine-tuning of 7B models works. Full fine-tuning requires gradient checkpointing. The flash-attention fallback path is slower than on RDNA3.
Install hints
- Same install as gfx1100.
- Add gradient_checkpointing: true and micro_batch_size: 1 to your config.
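The two memory-saving options above slot into the training config like this — a minimal sketch assuming an axolotl-style YAML file; the base_model, adapter, and accumulation values are illustrative placeholders, not recommendations:

```yaml
# Memory-saving settings for RDNA2 cards; base_model is a placeholder.
base_model: meta-llama/Llama-2-7b-hf
load_in_4bit: true              # QLoRA: 4-bit quantized base model
adapter: qlora
gradient_checkpointing: true    # required for full fine-tuning on these cards
micro_batch_size: 1             # keep per-step memory low
gradient_accumulation_steps: 8  # recover effective batch size
```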
SDXL is slow; SD 1.5 works fine. Expect memory pressure with larger models.
ENV vars
export HSA_OVERRIDE_GFX_VERSION=10.3.0
Install hints
- Same as gfx1100 install; expect ~3-4x slower SDXL generation.
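The override must be visible to every process that touches ROCm, so it is worth a sanity check before launching anything (plain POSIX shell):

```shell
# 10.3.0 corresponds to the gfx1030 target (RX 6800/6900); the override
# tells ROCm user-space libraries to load kernels built for that ISA.
export HSA_OVERRIDE_GFX_VERSION=10.3.0

# If this prints nothing, the export did not happen in the shell that
# launches Python, and the override silently does nothing.
echo "$HSA_OVERRIDE_GFX_VERSION"
```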
RX 6800/6900 — functional, but kernel tuning may differ from RDNA3. Performance is good for quantized models; throughput is roughly 60-70% of gfx1100.
Install hints
- pip install torch --index-url https://download.pytorch.org/whl/rocm6.2
- pip install exllamav2
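A quick back-of-envelope for why quantized models are the sweet spot on a 16 GB card: weight memory scales linearly with bits per weight. A stdlib-only sketch — the 7B parameter count and bit-widths are illustrative assumptions, not values read from exllamav2:

```python
def weight_gib(params: float, bits_per_weight: float) -> float:
    """Approximate GiB needed just for the weights (no KV cache, no activations)."""
    return params * bits_per_weight / 8 / 2**30

params_7b = 7e9
for label, bpw in [("fp16", 16.0), ("8-bit", 8.0), ("4.5 bpw EXL2", 4.5)]:
    print(f"{label}: {weight_gib(params_7b, bpw):.1f} GiB")
```

An fp16 7B model alone nearly fills a 16 GB card before any KV cache is allocated, while a ~4.5 bpw quant leaves ample headroom.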
PyTorch ROCm wheels are available, but transcription is slower than on gfx1100.
Install hints
- pip install torch torchaudio --index-url https://download.pytorch.org/whl/rocm6.2
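On ROCm builds of PyTorch, the GPU is surfaced under the cuda device name, so the stock whisper CLI can be pointed at it directly. A small stdlib helper that assembles the command — the helper itself is hypothetical, but --model and --device are real whisper CLI flags:

```python
def build_whisper_cmd(audio_path: str, model: str = "small") -> list[str]:
    # "cuda" is correct on ROCm: HIP devices are exposed through the
    # cuda device string, not a separate "rocm" device.
    return ["whisper", audio_path, "--model", model, "--device", "cuda"]

cmd = build_whisper_cmd("meeting.wav")
```

Pass `cmd` to `subprocess.run` once whisper is installed, or just run the equivalent command line by hand.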
RX 6800/6900 — works out of the box with the HIP build. Performance is slightly below gfx1100. The Vulkan build is an alternative if ROCm gives trouble.
Install hints
- Same HIP build as gfx1100.
- Vulkan alternative: cmake -B build -DGGML_VULKAN=ON && cmake --build build --config Release -j$(nproc)
Works on RX 6800/6900 with ROCm 6.x. Some older guides recommend setting HSA_OVERRIDE_GFX_VERSION, but this is no longer needed.
Install hints
- curl -fsSL https://ollama.com/install.sh | sh
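Once the service is running, Ollama answers on a local HTTP API (default port 11434); the body for /api/generate is plain JSON. A stdlib sketch that only builds the payload — the model tag is a placeholder for whatever you have pulled:

```python
import json

def generate_payload(model: str, prompt: str) -> str:
    # stream=False requests a single JSON response instead of a stream
    # of chunks; model must match a tag pulled with `ollama pull`.
    return json.dumps({"model": model, "prompt": prompt, "stream": False})

body = generate_payload("llama3.1", "Why is the sky blue?")
```

POST this body to http://localhost:11434/api/generate with curl or urllib to get a completion back.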
RX 6800/6900 — works with HSA_OVERRIDE_GFX_VERSION set. SDXL is slow; SD 1.5 is fine. Use --medvram for stability.
ENV vars
export HSA_OVERRIDE_GFX_VERSION=10.3.0
Install hints
- Same install as gfx1100.
- Launch with: ./webui.sh --medvram
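Why SDXL hurts where SD 1.5 does not: the UNet runs on latents at 1/8 of image resolution, so SDXL's native 1024x1024 means 4x the latent elements of SD 1.5's 512x512. A stdlib sketch — the 8x spatial downsampling and 4 latent channels are the standard Stable Diffusion VAE layout:

```python
def latent_elements(width: int, height: int, channels: int = 4, factor: int = 8) -> int:
    # The VAE downsamples each spatial dimension by `factor`.
    return (width // factor) * (height // factor) * channels

sd15 = latent_elements(512, 512)    # 64 x 64 x 4
sdxl = latent_elements(1024, 1024)  # 128 x 128 x 4
ratio = sdxl // sd15
```

Every UNet step touches 4x the data (and SDXL's UNet is itself much larger), which is why --medvram helps here.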
RX 6800/6900 — works but slower than RDNA3 cards. PagedAttention works. Flash-attention backend may fall back to a slower kernel.
Install hints
- pip install vllm --extra-index-url https://download.pytorch.org/whl/rocm6.2
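PagedAttention's benefit is easiest to see in KV-cache arithmetic: the cache is allocated in fixed-size token blocks rather than one max-context slab per request. A sketch with Llama-7B-like shapes — the 32 layers, 32 KV heads, head dim 128, and fp16 are assumed model parameters, not values queried from vLLM:

```python
def kv_bytes_per_token(layers=32, kv_heads=32, head_dim=128, dtype_bytes=2):
    # 2x accounts for the separate K and V tensors at every layer.
    return 2 * layers * kv_heads * head_dim * dtype_bytes

per_token = kv_bytes_per_token()  # 0.5 MiB of cache per generated token
block = 16 * per_token            # vLLM's default block_size is 16 tokens
```

A request that stops at 200 tokens therefore pins only ~100 MiB of KV cache, instead of a context-length-sized allocation.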