Best models for 24 GB VRAM

RTX 3090 / 4090 / 5090 / M-series Mac with 36 GB+

24 GB is the sweet spot for LLM enthusiasts: 30B-class models at Q4 quantization with 32K+ context, plus full FLUX.1 image generation.

  1. Qwen 2.5 32B (Alibaba): Premium 32B model with top-tier reasoning. Mac with 32GB+ RAM. 32B parameters, 18.99 GB.
  2. Gemma 3 27B (Google): Google's flagship open model; near GPT-4 quality. Needs 20GB+ RAM. 27B parameters, 15.91 GB.
  3. Mistral Small 22B (Mistral AI): Strong reasoning and multilingual. Needs 16GB+ RAM. 22B parameters, 12.93 GB.
  4. Qwen 2.5 Coder 14B (Alibaba): Powerful code model, excellent for complex programming tasks. 14B parameters, 8.87 GB.
  5. FLUX.1 Dev (GGUF) (Black Forest Labs): Highest-quality FLUX model; 20-50 steps. Mac with 24GB+ RAM. 12B parameters, 14 GB.
  6. Stable Diffusion XL (CoreML) (Stability AI): Higher-quality image generation, CoreML-optimized for iOS. Requires 6GB+ usable memory (iPad/Mac recommended). 3.5B parameters, 3.34 GB.
  7. Stable Diffusion 3 Medium (GGUF) (Stability AI): SD 3 with MMDiT architecture and superior text rendering. 2.5B parameters, 9.15 GB.
  8. Yi 1.5 9B Chat (01.AI): Bilingual model with strong reasoning. 9B parameters, 5.46 GB.
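As a rough sanity check on the sizes above, a Q4 GGUF's footprint is approximately parameters × bits-per-weight / 8, plus a KV cache that grows linearly with context length. The sketch below is a back-of-envelope estimate, not an exact formula: the ~4.5 bits/weight figure for Q4 and the layer/head geometry for a 32B-class model are assumptions, and real loaders add runtime buffers on top.

```python
# Back-of-envelope VRAM estimate for a quantized GGUF model.
# All figures approximate; real runtimes add compute buffers and overhead.

def weights_gb(params_b: float, bits_per_weight: float = 4.5) -> float:
    """Q4-class quants average roughly 4.5 bits/weight with scales (assumed)."""
    return params_b * bits_per_weight / 8  # params in billions -> GB

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                ctx: int, bytes_per_value: int = 2) -> float:
    """KV cache: 2 tensors (K and V) * layers * kv_heads * head_dim * tokens."""
    return 2 * layers * kv_heads * head_dim * ctx * bytes_per_value / 1e9

# Assumed GQA geometry for a Qwen-2.5-32B-class model:
# 64 layers, 8 KV heads, head_dim 128, at 32K context.
w = weights_gb(32)                         # ~18 GB, close to the 18.99 GB above
kv = kv_cache_gb(64, 8, 128, 32768)        # ~8.6 GB at fp16
kv_q8 = kv_cache_gb(64, 8, 128, 32768, 1)  # ~4.3 GB with an 8-bit KV cache
print(f"weights {w:.1f} GB, KV fp16 {kv:.1f} GB, KV q8 {kv_q8:.1f} GB")
```

Note that with these assumed figures, fp16 KV at 32K context would overshoot 24 GB (~18 + ~8.6 GB); quantizing the cache to 8 bits brings the total back under budget, which is how long contexts typically fit on these cards.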
