RX 7600 (8 GB) — QLoRA fine-tuning of a 7B model is tight. Use per_device_train_batch_size: 1 and gradient_checkpointing; offload optimizer states to CPU if you hit OOM.
ENV vars
export HSA_OVERRIDE_GFX_VERSION=11.0.0
export PYTORCH_HIP_ALLOC_CONF=expandable_segments:True
Install hints
- Same install as gfx1100.
- Config: micro_batch_size: 1, gradient_checkpointing: true, optimizer: adamw_8bit
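As a sketch, the low-VRAM settings above can be collected into one trainer config fragment. The filename, the gradient_accumulation_steps value, and the assumption that the trainer reads axolotl-style YAML keys are all hypothetical, not confirmed by these notes.

```shell
# Hypothetical: gather the low-VRAM settings into a config fragment.
# Filename and axolotl-style keys are assumptions.
cat > lowvram-7b.yml <<'EOF'
micro_batch_size: 1
gradient_accumulation_steps: 8
gradient_checkpointing: true
optimizer: adamw_8bit
EOF
cat lowvram-7b.yml
```

gradient_accumulation_steps raises the effective batch size without extra VRAM; tune it to taste.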
RX 7600 — 8 GB VRAM is tight. SD 1.5 works; SDXL requires the --lowvram flag and is slow.
ENV vars
export HSA_OVERRIDE_GFX_VERSION=11.0.0
export PYTORCH_HIP_ALLOC_CONF=expandable_segments:True
Install hints
- Same install as gfx1100.
- Launch with: python main.py --lowvram --listen
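A minimal launcher tying the env overrides to the low-VRAM launch, as a sketch. It prints the command instead of exec-ing it so it is safe to dry-run anywhere; swap echo for exec to actually start the server.

```shell
# Sketch: one launcher that sets the gfx1102 overrides, then starts the server.
launch_lowvram() {
    export HSA_OVERRIDE_GFX_VERSION=11.0.0
    export PYTORCH_HIP_ALLOC_CONF=expandable_segments:True
    # Print rather than exec so the sketch is safe to run anywhere;
    # replace `echo` with `exec` to launch for real.
    echo python main.py --lowvram --listen
}
launch_lowvram
```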
RX 7600 (8 GB) — 7B EXL2 4bpw fits with room to spare. 13B Q4 is tight but possible. Monitor VRAM with rocm-smi; you may need to reduce context length.
ENV vars
export HSA_OVERRIDE_GFX_VERSION=11.0.0
Install hints
- pip install torch --index-url https://download.pytorch.org/whl/rocm6.2
- pip install exllamav2
- Reduce max_seq_len if VRAM is exhausted.
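A rough back-of-envelope helper for the sizing claims above: weight size in MB ≈ params × bpw / 8. This is an approximation of weights only (KV cache and activations add more, so leave headroom), not an exllamav2 API; bpw is passed in tenths to stay in integer arithmetic.

```shell
# Rough estimate: weight footprint of an EXL2 quant, weights only.
# usage: est_weight_mb <params_in_millions> <bits_per_weight_in_tenths>
est_weight_mb() {
    # size_MB = params_M * bpw / 8, with bpw given as tenths (4.0 bpw -> 40)
    echo $(( $1 * $2 / 80 ))
}
est_weight_mb 7000 40    # 7B at 4.0 bpw -> 3500 MB, comfortable in 8 GB
est_weight_mb 13000 40   # 13B at 4.0 bpw -> 6500 MB, tight alongside KV cache
```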
RX 7600 — 8 GB VRAM; Q4_K_M models up to 8B fit fully on GPU. For larger models, cap the GPU layer count with --n-gpu-layers and let the remaining layers run on CPU. The Vulkan build is an alternative if ROCm gives trouble.
Install hints
- Same HIP build as gfx1100.
- Use --n-gpu-layers 30 to partially offload larger models.
- Vulkan alternative: cmake -B build -DGGML_VULKAN=ON && cmake --build build --config Release -j$(nproc)
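A small sketch of choosing between the two builds automatically: if the ROCm runtime is visible, assume the HIP build is viable, otherwise fall back to Vulkan. The -DGGML_HIP flag name is an assumption that may vary across llama.cpp versions; the Vulkan flag matches the command above.

```shell
# Sketch: pick a llama.cpp cmake configure line based on what the host has.
# -DGGML_HIP is assumed; check your llama.cpp version's build docs.
pick_backend() {
    if command -v rocminfo >/dev/null 2>&1; then
        echo "cmake -B build -DGGML_HIP=ON"
    else
        echo "cmake -B build -DGGML_VULKAN=ON"
    fi
}
pick_backend
```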
RX 7600 — works on Linux with ROCm 6.x. Lower VRAM (8 GB) limits model size; stick to ≤7B Q4.
Install hints
- curl -fsSL https://ollama.com/install.sh | sh
- Limit to models ≤7B Q4 due to 8 GB VRAM.
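The ≤7B Q4 rule of thumb can be encoded as a guard before pulling, as a sketch. The size check is a heuristic on the tag name and assumes tags shaped like "name:7b-instruct-q4_K_M"; the example tag is hypothetical.

```shell
# Sketch: heuristic guard for 8 GB VRAM — accept only q4 tags at <=7B params.
# Assumes Ollama-style tags with an "<N>b" size component.
fits_8gb() {
    tag=$1
    size=$(printf '%s' "$tag" | sed -n 's/.*[:-]\([0-9][0-9]*\)b.*/\1/p')
    case "$tag" in *q4*) quant=q4 ;; *) quant=other ;; esac
    [ -n "$size" ] && [ "$size" -le 7 ] && [ "$quant" = q4 ]
}
fits_8gb "qwen2.5:7b-instruct-q4_K_M" && echo "ok to pull"
```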
RX 7600 — 8 GB VRAM. SD 1.5 works; SDXL needs the --medvram or --lowvram flag and is slow.
ENV vars
export HSA_OVERRIDE_GFX_VERSION=11.0.0
Install hints
- Same as gfx1100.
- Launch with: ./webui.sh --medvram
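Rather than typing the flags each run, they can be persisted in webui-user.sh via COMMANDLINE_ARGS, the stock AUTOMATIC1111 mechanism for default launch flags; adding the GFX override export alongside it is a sketch of keeping everything in one place.

```shell
# Sketch: persist the env override and low-VRAM flag in webui-user.sh
# so a plain ./webui.sh picks them up automatically.
cat >> webui-user.sh <<'EOF'
export HSA_OVERRIDE_GFX_VERSION=11.0.0
export COMMANDLINE_ARGS="--medvram"
EOF
cat webui-user.sh
```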
RX 7600 (8 GB) — vLLM pre-allocates most of VRAM by default, so lower --gpu-memory-utilization to 0.80 or below. Stick to 7B Q4/Q8 models.
ENV vars
export HSA_OVERRIDE_GFX_VERSION=11.0.0
Install hints
- pip install vllm --extra-index-url https://download.pytorch.org/whl/rocm6.2
- python -m vllm.entrypoints.openai.api_server --model <model> --gpu-memory-utilization 0.80
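Once the server is up, the OpenAI-compatible completions endpoint can be exercised with curl, as a sketch. The port is vLLM's default (8000); the model placeholder stays as-is, matching whatever you passed to --model, and the curl call is commented out so the snippet is safe to dry-run without a server.

```shell
# Sketch: build a request for the OpenAI-compatible /v1/completions endpoint.
# "<model>" is a placeholder — substitute the --model value from the launch line.
MODEL="<model>"
cat > req.json <<EOF
{"model": "$MODEL", "prompt": "Hello", "max_tokens": 16}
EOF
# curl -s http://localhost:8000/v1/completions \
#      -H 'Content-Type: application/json' -d @req.json
cat req.json
```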
No data yet for: faster-whisper