Can RTX 5070 run BGE Large EN v1.5?

Yes — runs locally

~132 tok/sec · Instant — feels like typing. No noticeable delay.

Your VRAM

12 GB

Model size

0.335B

Best quant

FP16

VRAM needed

1.1 GB

The verdict

The RTX 5070 (12 GB VRAM) handles BGE Large EN v1.5 comfortably at FP16 precision, which needs about 1.1 GB of VRAM. Expected throughput is around 132 tokens/second, which feels instant in interactive use, with no noticeable delay. BGE Large EN v1.5 is a high-quality English embedding model aimed at accurate English search.

How to run it

1. Install Ollama or LM Studio.
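2. Pull the FP16 GGUF. At about 1.1 GB it fits easily in 12 GB, so there is no need for a smaller quant.
3. Start embedding text. This is an embedding model, so it returns vectors rather than chat replies. Expect around 132 tok/sec, possibly faster after warmup.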
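
The steps above can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not an official recipe: it assumes Ollama is running on its default local port, that its embedding endpoint is /api/embed, and that the model was pulled under the tag bge-large. None of those details come from this page, so adjust them to your install.

```python
import json
import urllib.request

# Assumptions (not from this page): Ollama listens on its default local port,
# and the model was pulled under the tag "bge-large".
OLLAMA_URL = "http://localhost:11434/api/embed"
MODEL_TAG = "bge-large"


def build_embed_request(texts, model=MODEL_TAG, url=OLLAMA_URL):
    """Build (but do not send) a POST request asking Ollama for embeddings."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, timeout=30.0):
    """Send the request and return one embedding (list of floats) per input."""
    request = build_embed_request(texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["embeddings"]
```

With the server running, embed(["how do I reset my password?"]) should return a list containing one vector. Keeping the request builder separate from the network call makes the payload easy to inspect without a running server.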

FAQ

What GPU do I need to run BGE Large EN v1.5?

BGE Large EN v1.5 needs roughly 0.8 GB to 1.1 GB of VRAM, depending on the quantization level, so any GPU with at least 1.1 GB of free VRAM can run it at FP16.

Is BGE Large EN v1.5 good for coding?

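BGE Large EN v1.5 is an embedding model: it turns English text into vectors for search and retrieval and does not generate code or any other text. It is therefore not a coding assistant, although its embeddings could be used to search English-language documentation or comments.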

BGE Large EN v1.5 vs Llama 3.1 8B?

BGE Large EN v1.5 has 0.335 billion parameters, far fewer than the 8 billion of Llama 3.1 8B, so it needs much less memory. The two serve different purposes: BGE Large EN v1.5 produces embeddings for search and retrieval, while Llama 3.1 8B is a general-purpose model that generates text.
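
To make "embedding tasks" concrete, here is a minimal sketch of how embeddings are typically used for search: documents are ranked by cosine similarity to a query vector. The three-dimensional vectors are made-up stand-ins for real model output, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, doc) for doc in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors invented for illustration; real embeddings have many more dimensions.
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # unrelated to the query
    [0.9, 0.1, 0.0],  # closest to the query
    [0.5, 0.5, 0.0],  # partially related
]
ranking = rank_documents(query, docs)
```

A generative model such as Llama 3.1 8B would instead be asked to write an answer; the embedding model only supplies the vectors that make this kind of ranking possible.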

Can I run BGE Large EN v1.5 on a Mac?

Yes. BGE Large EN v1.5 runs on a Mac as long as about 0.8 GB to 1.1 GB of memory is free for the model and the necessary software is installed. Apple silicon Macs share unified memory between the CPU and GPU rather than having separate VRAM.

How much VRAM does BGE Large EN v1.5 need?

BGE Large EN v1.5 requires between 0.8 GB and 1.1 GB of VRAM, depending on the quantization level used.
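
A rough back-of-the-envelope check shows where figures like these come from. The sketch below computes the memory for the weights alone, assuming 2 bytes per parameter at FP16 and 1 GB = 1e9 bytes. Treating the gap between that result and the 1.1 GB quoted here as runtime overhead (activations, buffers) is an assumption, not something this page states.

```python
def weight_memory_gb(params_billions, bytes_per_param):
    """Memory needed for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1e9


def vram_headroom_gb(total_vram_gb, needed_gb):
    """VRAM left over after loading the model; negative means it does not fit."""
    return total_vram_gb - needed_gb


# Figures taken from this page: 0.335B parameters, 12 GB card, 1.1 GB needed.
fp16_weights_gb = weight_memory_gb(0.335, 2)  # FP16 stores 2 bytes per parameter
headroom_gb = vram_headroom_gb(12.0, 1.1)
```

The weights come to about 0.67 GB at FP16, which is consistent with the 1.1 GB total once some overhead is added, and leaves roughly 10.9 GB of a 12 GB card free.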

Is BGE Large EN v1.5 censored?

The question does not really apply. BGE Large EN v1.5 is an embedding model that outputs numeric vectors rather than text, so there are no generated responses to censor or filter.

Is BGE Large EN v1.5 commercial-use allowed?

Yes, BGE Large EN v1.5 is licensed under the MIT license, which allows for both commercial and non-commercial use.

BGE Large EN v1.5 context length?

BGE Large EN v1.5 has a context length of 512 tokens, which is suitable for most embedding tasks. Longer inputs need to be truncated or split into chunks before embedding.
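
For documents longer than 512 tokens, one common approach is to split the token sequence into overlapping windows and embed each window separately. The sketch below is illustrative only: the function name and the 32-token overlap are arbitrary choices, and in practice the tokens would come from the model's own tokenizer, whose special tokens also count against the limit.

```python
def chunk_tokens(tokens, max_len=512, overlap=32):
    """Split a token list into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens so that text near a
    boundary appears in both neighbouring chunks.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be >= 0 and smaller than max_len")
    step = max_len - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


# Stand-in "tokens": integers instead of real tokenizer output.
example_chunks = chunk_tokens(list(range(1000)))
```

Each chunk can then be embedded on its own, and the resulting vectors either stored separately or averaged, depending on how the search index is organised.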
