~/runthismodel

Curated collections

Hand-picked shortlists across 145+ models. Pick the list that matches your hardware or your job — skip the comparison spreadsheet.

8 GB tier · 8 picks

Best models for 8 GB VRAM

RTX 3060 / 3070 / M-series Mac with 16 GB unified memory

Curated picks that comfortably fit on an 8 GB GPU. Each ships in a Q4_K_M quant that leaves headroom for a 4–8K context window. Sorted by quality-per-byte.
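The "fits with headroom" claim comes down to simple arithmetic: quantized weights plus KV cache plus some runtime overhead must stay under the card's VRAM. Here is a rough sketch of that estimate. The figure of about 4.85 bits per weight for Q4_K_M, the 1 GiB overhead, and the example model shape (32 layers, 8 KV heads, head dimension 128) are illustrative assumptions, not values taken from this page.

```python
# Back-of-the-envelope VRAM estimate: quantized weights + KV cache + overhead.
# Every constant below is an assumption for illustration, not a measurement.

GIB = 1024 ** 3


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_value: int = 2) -> int:
    """KV cache size in bytes; the leading 2 covers keys and values."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


def fits(n_params: float, bits_per_weight: float, n_layers: int,
         n_kv_heads: int, head_dim: int, context_len: int,
         vram_gib: float, overhead_gib: float = 1.0) -> bool:
    """True if weights + KV cache + assumed overhead fit in vram_gib."""
    total = (weight_bytes(n_params, bits_per_weight)
             + kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len)
             + overhead_gib * GIB)
    return total <= vram_gib * GIB


if __name__ == "__main__":
    # Hypothetical 8B model at ~4.85 bits/weight with an 8K context (fp16 cache).
    print(weight_bytes(8e9, 4.85) / GIB)            # weights, in GiB
    print(kv_cache_bytes(32, 8, 128, 8192) / GIB)   # KV cache, in GiB
    print(fits(8e9, 4.85, 32, 8, 128, 8192, vram_gib=8.0))
```

Under these assumptions the weights take roughly 4.5 GiB and the 8K KV cache 1 GiB, which is why an 8B-class model at Q4 leaves room on an 8 GB card. Real usage varies with the runtime, the cache precision, and the model architecture.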

12 GB tier · 8 picks

Best models for 12 GB VRAM

RTX 3060 12 GB / RTX 4070 / M-series Mac with 24 GB

12 GB unlocks bigger 12–14B models with a comfortable context window. Strong sweet spot for code, vision, and reasoning.

24 GB tier · 8 picks

Best models for 24 GB VRAM

RTX 3090 / 4090 / 5090 / M-series Mac with 36 GB+

24 GB is the LLM enthusiast sweet spot: 30B-class models at Q4 with 32K+ context, plus full FLUX.1 image generation.

4 GB tier · 8 picks

Best models for iPhone & iPad

Recent A17/A18/M-series with 8 GB+ unified memory

Sub-3B-parameter models that ship with our iOS app and run entirely on-device. No internet, no API keys, just inference.

Use case · 8 picks

Best coding models

From 1B autocomplete to 14B agentic refactoring

Code-specialised models ranked by HumanEval-class performance. Pair with Cursor / Continue / Aider for an offline copilot.

Use case · 6 picks

Best reasoning models

Chain-of-thought / o1-style local thinkers

Models trained to show their work. Ideal for math, code, and multi-step logic puzzles. All run with `<think>` traces enabled.

Use case · 6 picks

Best vision models

Local GPT-4V replacements

Image-in / text-out — describe screenshots, parse documents, count objects, read receipts. All work fully offline.

Use case · 7 picks

Best image-generation models

From SD 1.5 to FLUX.1

Local Stable Diffusion / Flux family. Each line in our blur-to-sharp Compare lab uses one of these.

Use case · 7 picks

Best voice models (STT + TTS)

Whisper + Piper + Kokoro

Speech-in / speech-out building blocks for offline voice assistants. Pair Whisper for STT with Piper or Kokoro for TTS.

Use case · 8 picks

Uncensored chat & assistant models

Refusal-removed general LLMs — abliterated, Dolphin, and natural base models

Local LLMs without the 'I can't help with that' reflex. Includes mlabonne abliterations (refusal-direction ablation, no retraining), Cognitive Computations Dolphin fine-tunes, and official base models that were never RLHF-aligned. Read each model's license — some inherit Llama Community terms.
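Refusal-direction ablation is usually described as projecting a single direction out of the model's activations, which is why no retraining is involved. The toy sketch below shows only that projection step on plain Python lists. It assumes a refusal direction has already been estimated elsewhere; the function name and the vectors are hypothetical, and this is not the code used to produce any model listed here.

```python
import math


def ablate_direction(hidden: list[float], direction: list[float]) -> list[float]:
    """Remove the component of `hidden` that lies along `direction`.

    Computes h' = h - (h . r_hat) * r_hat, where r_hat is the unit vector
    along the (assumed, pre-estimated) refusal direction.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    # Scalar projection of the hidden state onto the unit direction.
    proj = sum(h * u for h, u in zip(hidden, unit))
    return [h - proj * u for h, u in zip(hidden, unit)]


if __name__ == "__main__":
    h = [3.0, 4.0]   # toy hidden state
    r = [1.0, 0.0]   # toy "refusal direction"
    print(ablate_direction(h, r))  # [0.0, 4.0]
```

After the projection, the result has no component along the chosen direction, while everything orthogonal to it is left unchanged. Published abliteration workflows typically apply the same idea to weight matrices across many layers rather than to a single vector.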

Use case · 9 picks

Uncensored creative writing & roleplay

TheDrummer, Sao10K, Anthracite — long-form prose and character chat

Models fine-tuned for narrative writing and character roleplay without alignment filters. Cydonia and Rocinante (TheDrummer), Euryale and Stheno (Sao10K), and Magnum (Anthracite) are the active reference families. Some carry non-commercial licenses — check before commercial use.

Use case · 1 pick

Uncensored coding models

Code generation without filters

Coding-specialized models with the refusal direction ablated. Useful for security research, dual-use tooling, and code that mainstream-aligned assistants decline to write. Codestral derivatives inherit Mistral's non-commercial research license.

Use case · 2 picks

Naturally uncensored base models

Official foundation models with no instruct or RLHF alignment

Pretrained-only models from Mistral, Qwen, and Meta — no abliteration needed because alignment was never applied. Closer to the raw distribution; less assistant-shaped, more open-ended. Best for researchers and for fine-tuning your own assistant.