Can RTX 4070 Ti SUPER run BGE Small EN v1.5?

Yes — runs locally

~144 tok/sec · Instant — feels like typing. No noticeable delay.

Your VRAM

16 GB

Model size

0.033B

Best quant

Q8_0

VRAM needed

0.1 GB

The verdict

The RTX 4070 Ti SUPER (16 GB VRAM) handles BGE Small EN v1.5 comfortably using the Q8_0 quantization, which fits in 0.1 GB. Expected throughput is around 144 tokens/second, with no noticeable delay in interactive use. BGE Small EN v1.5 is a compact English embedding model, good for basic semantic search.

How to run it

1. Install Ollama or LM Studio.
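2. Pull the Q8_0 GGUF, the best balance of quality and speed on 16 GB.
3. Start embedding text. BGE Small EN v1.5 is an embedding model, so it returns vectors rather than chat replies. Expect ~144 tok/sec on the first request and faster throughput after warmup.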
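As a rough sketch of step 3, assuming Ollama is running locally on its default port (11434) and serving its `/api/embed` endpoint, a request for embeddings might be built like this. The model tag `bge-small-en-v1.5:q8_0` is a hypothetical placeholder, not a confirmed registry name; substitute whatever tag you actually pulled. The helper names `build_embed_request` and `embed` are also illustrative.

```python
import json
import urllib.request

# Assumptions: Ollama's default local address and a placeholder model tag.
OLLAMA_URL = "http://localhost:11434/api/embed"
MODEL_TAG = "bge-small-en-v1.5:q8_0"  # hypothetical; use the tag you pulled


def build_embed_request(texts, model=MODEL_TAG, url=OLLAMA_URL):
    """Build (but do not send) a POST request asking for embeddings."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts):
    """Send the request and return one embedding vector per input text."""
    with urllib.request.urlopen(build_embed_request(texts)) as resp:
        return json.load(resp)["embeddings"]


# Usage (requires a running Ollama server with the model pulled):
#   vectors = embed(["How do I reset my password?"])
#   print(len(vectors), "vector(s) of dimension", len(vectors[0]))
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.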


FAQ

What GPU do I need to run BGE Small EN v1.5?

BGE Small EN v1.5 requires a minimum of 0.1 GB of VRAM, so most modern GPUs should suffice. However, for optimal performance, a GPU with at least 2 GB of VRAM is recommended.

Is BGE Small EN v1.5 good for coding?

BGE Small EN v1.5 is an embedding model: it turns text into vectors and does not generate code or text, so it is not designed for coding tasks. It is better suited to semantic search and other natural language understanding tasks.
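To illustrate what semantic search with an embedding model involves, here is a minimal sketch that ranks documents by cosine similarity to a query. The three-dimensional vectors are made up for illustration; real embeddings from the model have far more dimensions, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    """Return document names sorted from most to least similar to the query."""
    scored = {name: cosine_similarity(query_vec, vec)
              for name, vec in doc_vecs.items()}
    return sorted(scored, key=scored.get, reverse=True)


# Toy vectors standing in for real embeddings (invented for this example).
docs = {
    "password reset guide": [0.9, 0.1, 0.0],
    "billing overview": [0.1, 0.9, 0.2],
    "release notes": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of a password-related question

print(rank(query, docs))
```

In practice the query and document vectors would both come from the embedding model, and the ranking step stays the same.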

BGE Small EN v1.5 vs Llama 3.1 8B?

The two are different kinds of model. BGE Small EN v1.5 is an embedding model with only 0.033 billion parameters, while Llama 3.1 8B is a text-generation model with 8 billion parameters. BGE Small EN v1.5 is far smaller and lighter, which suits resource-constrained environments and tasks like semantic search, but it cannot generate text the way Llama 3.1 8B does.

Can I run BGE Small EN v1.5 on a Mac?

Yes, you can run BGE Small EN v1.5 on a Mac. At about 0.1 GB the model needs very little memory, so a compatible GPU helps but CPU resources alone are usually enough.

How much VRAM does BGE Small EN v1.5 need?

BGE Small EN v1.5 requires a minimum of 0.1 GB of VRAM. The exact VRAM usage can vary slightly depending on the quantization level used.
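The dependence on quantization can be checked with simple arithmetic. The sketch below estimates the size of the weights alone from the 0.033B parameter count given on this page. It assumes Q8_0 costs about 8.5 bits per weight (32 8-bit weights plus one 16-bit scale per block); the gap up to the quoted 0.1 GB is presumably runtime overhead such as activations and buffers, which is an assumption rather than a measured figure.

```python
PARAMS = 0.033e9  # parameter count from this page (0.033B)

# Approximate storage cost per weight for each format.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,  # 32 int8 weights plus one 16-bit scale per block
}


def weight_size_gb(params, bits_per_weight):
    """Size of the weights alone, in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


for quant, bits in BITS_PER_WEIGHT.items():
    print(f"{quant}: {weight_size_gb(PARAMS, bits):.3f} GB")
```

Either way the weights are a small fraction of the 16 GB available, so the choice of quantization matters little for this model on this card.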

Is BGE Small EN v1.5 censored?

BGE Small EN v1.5 is not explicitly censored, and the question largely does not apply: as an embedding model it outputs vectors rather than generated text, so it has no responses to filter or refuse.

Is BGE Small EN v1.5 commercial-use allowed?

Yes, BGE Small EN v1.5 is licensed under the MIT License, which allows for both commercial and non-commercial use.

BGE Small EN v1.5 context length?

BGE Small EN v1.5 has a context length of 512 tokens, which is suitable for most basic semantic search and natural language processing tasks.
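Documents longer than the context length have to be split before embedding. The following is a minimal chunking sketch; it counts whitespace-separated words as a crude stand-in for tokens, which is an assumption (the model's real tokenizer usually yields more tokens than words, so a smaller budget may be needed). The function name and the `overlap` parameter are hypothetical.

```python
def chunk_words(text, max_tokens=512, overlap=32):
    """Split text into overlapping chunks of at most max_tokens words."""
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")
    words = text.split()
    step = max_tokens - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # this chunk already reached the end of the text
    return chunks
```

Each chunk is then embedded separately, and the small overlap reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.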
