Small AI Models (Under 3B)
Small models with fewer than 3 billion parameters are designed for edge deployment, mobile devices, and hardware-constrained environments. Despite their compact size, modern small models perform well on many tasks. They start up quickly, require little VRAM (often under 2GB), and can run on devices as modest as a Raspberry Pi or an older smartphone. These models suit real-time applications, embedded systems, and privacy-focused local deployment.
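As a rough check on the VRAM figure above, weight memory can be approximated as parameter count times bytes per weight. The sketch below assumes that approximation and ignores the KV cache, activations, and runtime overhead, so real usage will be higher. The function name and the example bit widths are illustrative and not taken from any particular runtime.

```python
def estimate_weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    return num_params * bytes_per_weight / 1e9


if __name__ == "__main__":
    # Illustrative sizes and quantization levels (assumptions, not catalog data).
    for name, params in [("0.5B", 0.5e9), ("1.5B", 1.5e9), ("3B", 3e9)]:
        for bits in (16, 8, 4):
            gb = estimate_weight_memory_gb(params, bits)
            print(f"{name} model at {bits}-bit: ~{gb:.2f} GB")
```

Under these assumptions, a 3B model needs about 1.5 GB for weights at 4-bit and about 6 GB at 16-bit, which is why sub-2GB figures generally imply quantized weights.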
Sentence Transformers
all-MiniLM-L6-v2
Tiny embedding model. Only 23MB. Perfect for on-device search.
BAAI
BGE Small EN v1.5
Compact English embedding model. Good for basic semantic search.
Nomic AI
Nomic Embed Text v1.5
High quality text embedding model. 137M params. Good for RAG and search.
Alibaba
Qwen 2.5 1.5B
Compact 1.5B model with strong multilingual and coding abilities.
BAAI
BGE Large EN v1.5
High quality English embedding model. Best accuracy for English search.
OpenAI
Whisper Large v3 Turbo
Optimized large Whisper model. Near-best accuracy with faster inference.
BAAI
BGE Reranker v2 M3
Multilingual reranker. 100+ languages. 1.1GB.
Alibaba
Qwen 2.5 0.5B
Ultra-small 0.5B model from Alibaba. Minimal resource requirements.
OpenAI
Whisper Large v3
Largest Whisper model. Best accuracy across all languages and accents.
Meta
Llama 3.2 1B Instruct
Ultra-compact 1B model. Runs on virtually any device including smartphones.
TinyLlama
TinyLlama 1.1B
Lightweight 1.1B chat model based on Llama architecture. Great for phones.
Moondream
Moondream 2
Ultra-compact vision model. Only 1GB. Answers questions about images.
Alibaba
Qwen2-VL 2B
Compact vision-language model. Default multimodal model. Can understand images and answer questions about them.
OpenAI
Whisper Small
Compact Whisper model. Good accuracy for everyday transcription tasks.
Runway
Stable Diffusion 1.5 (CoreML)
Classic image generation model. Pre-converted to CoreML for iOS/Mac. Downloads as zip, auto-extracts.
OpenAI
Whisper Base
Base Whisper model. Good balance of speed and accuracy. 142MB.
HuggingFace
Distil-Whisper Large v3
Distilled Whisper. About 6x faster than large-v3, with word error rate within 1% of it.
HuggingFace
SmolLM2 135M
Tiny 135M model. Default LLM - guaranteed to run on any iPhone. Only 145MB download. Perfect for quick experiments.
Google
Gemma 3 1B
Google's latest tiny 1B model. Excellent quality for its size.
OpenAI
Whisper Tiny
Tiny multilingual speech recognition. Only 75MB. Supports 99 languages. Runs on any device.
DeepSeek
DeepSeek R1 Distill 1.5B
Compact reasoning model distilled from DeepSeek R1. Strong chain-of-thought in a tiny package.
Alibaba
Qwen 2.5 Coder 1.5B
Compact code model with solid code generation and understanding abilities.
OpenAI
Whisper Medium
Mid-size Whisper model. Strong multilingual speech recognition.
HuggingFace
SmolLM2 360M
Compact 360M model. Good for basic tasks on very constrained devices.
Alibaba
Qwen 2.5 Coder 0.5B
Smallest code model. Default code assistant - runs on any iPhone. Great for code completion and simple programming tasks.
Google
Gemma 2 2B
Google's compact 2.6B model. Efficient and capable for mobile use.
OpenBMB
MiniCPM-V 2.6
Efficient multimodal model with strong image understanding. Optimized for edge devices.
HuggingFace
SmolLM2 1.7B
Capable 1.7B model from HuggingFace. Good balance for mobile devices.
Meta
MusicGen Small
Music generation from text prompts. Requires multiple ONNX files (~435MB total). Experimental iOS support.
OpenAI
Whisper Tiny English (Quantized)
Smallest speech recognition model listed here. Only 32MB. English only. Default speech model - guaranteed to run on any iPhone.
Kokoro
Kokoro 82M TTS
High quality 82M parameter TTS model. Excellent speech synthesis with multiple voice options. 86MB download.
OpenAI
Whisper Base English
English-only base model. Faster and more accurate for English.
DeepSeek
DeepSeek Coder 1.3B
Compact code model with strong coding capabilities. Great for mobile coding assistants.
LG AI
EXAONE 3.5 2.4B
Compact model from LG. Optimized for Korean and English.
Snowflake
Snowflake Arctic Embed S
Compact embedding model from Snowflake. Good multilingual support.
H2O.ai
Danube 3 500M
Ultra-tiny 500M model. Smaller than SmolLM2 1.7B. Runs anywhere.
IBM
Granite 3.3 2B
IBM's compact 2B model. Good at following instructions.
Google
CodeGemma 2B
Lightweight code completion model from Google. Fast on-device code suggestions.
TII
Falcon 3 1B
Ultra-compact 1B model from Technology Innovation Institute.
Stability AI
Stable Diffusion 3 Medium (GGUF)
SD 3 with MMDiT architecture. Better text rendering than earlier Stable Diffusion versions.
Jina AI
Jina Reranker Tiny EN
Tiny English reranker. Only 67MB. Use with embedding models for better search.
Runway / GPUStack
Stable Diffusion 1.5 (GGUF)
SD 1.5 in single-file GGUF format. Alternative to CoreML. Uses stable-diffusion.cpp with Metal acceleration.
01.AI
Yi Coder 1.5B
Tiny code model. Great for phones. Fast completions.
Stability AI / Apple
Stable Diffusion 2.1 Base (CoreML)
Smallest CoreML image generation model. Palettized for minimal size (1.14GB). Runs on any iPhone with 6GB RAM. Default image generation model.
Rhasspy
Piper TTS - Amy (English)
Lightweight TTS voice. High quality English speech synthesis. Default TTS model - runs on any iPhone. Only 63MB.
Rhasspy
Piper TTS - Lessac (English)
High quality English female voice. 63MB download. Runs on any device.
Rhasspy
Piper TTS - LibriTTS-R (English)
Medium quality English voice with natural prosody. 63MB download.
Stability AI
Stable Diffusion 2.1 (GGUF)
SD 2.1 in GGUF format. Better quality than 1.5.
Rhasspy
Piper TTS - Spanish (MLS)
Spanish female voice. Natural prosody.
Rhasspy
Piper TTS - French (Siwis)
French female voice.
Rhasspy
Piper TTS - German (Thorsten)
German male voice.
Rhasspy
Piper TTS - Chinese (Huayan)
Chinese Mandarin voice.
Rhasspy
Piper TTS - Japanese (Kokoro)
Japanese voice.
Rhasspy
Piper TTS - Korean
Korean voice.
Rhasspy
Piper TTS - Russian (Irina)
Russian female voice.
Rhasspy
Piper TTS - Portuguese (Faber)
Portuguese voice.
Rhasspy
Piper TTS - Italian (Riccardo)
Italian male voice.
Rhasspy
Piper TTS - Arabic (Kareem)
Arabic voice.
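Several entries above are embedding models and rerankers, which are typically combined in a two-stage search: an embedding model retrieves candidates by vector similarity, then a reranker rescores the short list. The sketch below illustrates that flow with hand-written toy vectors and a placeholder scoring function. `toy_rerank_score`, `retrieve`, `rerank`, and the sample data are hypothetical stand-ins for illustration, not the API of any listed model.

```python
import math
from typing import Callable

Vector = list[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec: Vector, doc_vecs: dict[str, Vector], top_k: int) -> list[str]:
    """Stage 1: return the top_k documents by embedding similarity."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc: cosine_similarity(query_vec, doc_vecs[doc]),
        reverse=True,
    )
    return ranked[:top_k]


def rerank(query: str, candidates: list[str], score: Callable[[str, str], float]) -> list[str]:
    """Stage 2: reorder the candidates using a (query, document) scoring function."""
    return sorted(candidates, key=lambda doc: score(query, doc), reverse=True)


def toy_rerank_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a reranker: fraction of query words found in doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


if __name__ == "__main__":
    # Toy 3-dimensional vectors; a real embedding model would produce these.
    doc_vecs = {
        "how to reset a phone": [0.9, 0.3, 0.1],
        "phone battery tips": [1.0, 0.1, 0.0],
        "baking bread at home": [0.0, 0.2, 0.9],
    }
    query, query_vec = "reset phone", [1.0, 0.1, 0.0]
    candidates = retrieve(query_vec, doc_vecs, top_k=2)
    print("retrieved:", candidates)
    print("reranked: ", rerank(query, candidates, toy_rerank_score))
```

In this toy data the embedding stage ranks "phone battery tips" first, and the rerank stage promotes "how to reset a phone" because it matches both query words. Keeping the reranker as a plain callable means a real model could be swapped in without changing the pipeline shape.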