DeepSeek TUI is model-agnostic at the wiring layer but shines when paired with DeepSeek V4 via API. This reference collects related families that people still deploy locally or fine-tune, complete with typical sizes to aid discovery on Hugging Face or GGUF mirrors.
## Mixture-of-experts in one sentence
Many flagship DeepSeek checkpoints use MoE: a large total parameter count with a smaller subset active per token. This affects VRAM planning, so always check the model card for active vs. total counts before downloading quantized builds.
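A minimal Python sketch of the arithmetic, using illustrative numbers only (swap in the totals and actives from the card you actually intend to download):

```python
def quantized_size_gib(params_b: float, bits_per_weight: float) -> float:
    """Approximate on-disk / resident footprint in GiB.

    Every expert has to be loaded, so memory scales with the *total*
    parameter count; the active count mainly drives per-token compute.
    """
    return params_b * 1e9 * bits_per_weight / 8 / 1024**3

# Hypothetical MoE: 236B total / 21B active, quantized to ~4 bits per weight.
print(f"~{quantized_size_gib(236, 4):.0f} GiB to hold the checkpoint")
print(f"~{quantized_size_gib(21, 4):.0f} GiB of weights touched per token")
```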
## Families & tables

### ⚡ DeepSeek V4
Current-generation models DeepSeek positions for agentic coding and long context. DeepSeek TUI is tuned around V4 APIs and workflows.
| Model | Params | Context | License | Notes |
|---|---|---|---|---|
| deepseek-v4-flash | Mixture-of-experts (public docs) | 1M tokens | API terms | Cost-efficient default for tooling-heavy sessions. |
| deepseek-v4-pro | Mixture-of-experts (public docs) | 1M tokens | API terms | Stronger reasoning; higher price (discount periods may apply). |
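DeepSeek's hosted API follows the OpenAI wire format, so a minimal call can be sketched as below; the model id is taken from the table above and should be confirmed against the live API docs before use.

```python
import os
from openai import OpenAI

# DeepSeek's API is OpenAI-compatible, so the stock client works
# once base_url is swapped.
client = OpenAI(
    base_url="https://api.deepseek.com",
    api_key=os.environ["DEEPSEEK_API_KEY"],
)

resp = client.chat.completions.create(
    model="deepseek-v4-flash",  # assumed id, per the table above
    messages=[{"role": "user", "content": "Summarize this repo's build steps."}],
)
print(resp.choices[0].message.content)
```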
### 🔬 DeepSeek R1
Reasoning-focused line associated with chain-of-thought-style outputs; often compared against other reasoning models on math- and logic-heavy prompts. Distilled variants are also published at sizes large enough for strong local runs if you have the GPUs.
### 📦 DeepSeek V3
Prior flagship MoE generation; still widely referenced for Hugging Face downloads and GGUF conversions.
| Model | Params | Context | License | Notes |
|---|---|---|---|---|
| DeepSeek-V3 | 671B MoE (reported) | 128K (typical) | MIT | Succeeded by V4, which carries the newest API features. |
| DeepSeek-V3-0324 | 671B MoE | 128K | MIT | Point release commonly mirrored on Hugging Face. |
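If you just need the weights, a hedged sketch with `huggingface_hub` (repo id as commonly mirrored on Hugging Face):

```python
from huggingface_hub import snapshot_download

# Pull the point release into a local directory. The full checkpoint is
# hundreds of GB, so make sure local_dir points at a volume with room.
snapshot_download(
    repo_id="deepseek-ai/DeepSeek-V3-0324",
    local_dir="./DeepSeek-V3-0324",
)
```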
### 💻 DeepSeek Coder
Code-specialized models that were popular for completion and repository-level tasks before the unified V-series lines absorbed much of that traffic.
| Model | Params | Context | License | Notes |
|---|---|---|---|---|
| deepseek-coder-1.3b-base | 1.3B | 16K | DeepSeek license | Tiny baseline for constrained hardware. |
| deepseek-coder-6.7b-base | 6.7B | 16K | DeepSeek license | Useful small coder baseline. |
| deepseek-coder-33b-base | 33B | 16K | DeepSeek license | Strong local coder if VRAM allows. |
| deepseek-coder-v2-instruct | MoE (236B total, 21B active reported) | 128K | DeepSeek license | Instruct-tuned coding MoE. |
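As a starting point for the smallest coder checkpoint, a minimal `transformers` sketch (assumes a GPU via `device_map="auto"`; drop that argument to run on CPU):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-1.3b-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Base (non-instruct) model: plain completion, no chat template.
prompt = "# Python function that parses a semver string\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```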
### 🖼️ DeepSeek VL / VL2
Vision-language variants for image+text prompts; relevant when your workflow mixes screenshots, diagrams, or UI captures.
| Model | Params | Modality | License | Notes |
|---|---|---|---|---|
| DeepSeek-VL2 | MoE variants | Multimodal | DeepSeek license | Check Hugging Face cards for exact variant sizes (Tiny/Small/etc.). |
| DeepSeek-VL | 7B class | Multimodal | DeepSeek license | Earlier VL line; still referenced in quantization/GGUF repos. |
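When a VL variant is served behind an OpenAI-compatible endpoint (vLLM accepts this message shape for vision-language models), an image+text request can be sketched as below; the endpoint, port, and served model name are assumptions to match to your own deployment.

```python
import base64
from openai import OpenAI

# Assumed local OpenAI-compatible server (e.g. vLLM on its default port).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

with open("screenshot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="deepseek-vl2",  # assumed served model name; match your deployment
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            {"type": "text", "text": "What does this UI screenshot show?"},
        ],
    }],
)
print(resp.choices[0].message.content)
```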
## Running locally
GGUF, AWQ, and other quantized builds vary by maintainer. Start with the official model card, then follow the Local deployment guide for routing Ollama, vLLM, or SGLang into DeepSeek TUI.
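Once a local server is up, anything speaking the OpenAI wire format can route to it. A hedged sketch against Ollama's OpenAI-compatible endpoint (the model tag is an assumption; use whatever `ollama list` reports):

```python
from openai import OpenAI

# Ollama exposes an OpenAI-compatible API on port 11434; the api_key
# value is ignored but must be non-empty.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

resp = client.chat.completions.create(
    model="deepseek-coder-v2",  # assumed local tag
    messages=[{"role": "user", "content": "Explain this stack trace."}],
)
print(resp.choices[0].message.content)
```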