Open-source models from DeepSeek, with the largest variant (V3) at 671B parameters, trained on English and Chinese text.
Technical Specifications
Parameters
671B
Context Length
4K (4,096 tokens)
Architecture
Pre-norm decoder-only Transformer with RMSNorm, SwiGLU, and RoPE; earlier dense variants use grouped-query attention (GQA), while V3 uses Multi-head Latent Attention (MLA) with a DeepSeekMoE mixture-of-experts design (671B total parameters, ~37B activated per token)
Score
33.9
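The parameter count above translates directly into a storage requirement for the weights. A minimal back-of-the-envelope sketch, assuming 2 bytes per parameter (FP16/BF16, an assumption not stated in the spec sheet):

```python
# Rough memory-footprint estimate from the parameter count above.
# Assumption (not from the spec sheet): 2 bytes per parameter (FP16/BF16);
# quantized deployments (e.g. 8-bit or 4-bit) would need proportionally less.
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

total_gb = weight_memory_gb(671e9)  # all 671B parameters
print(f"FP16 weights: ~{total_gb / 1000:.2f} TB")
```

This counts raw weight storage only; activations, KV cache, and optimizer state (for training) add further memory on top.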
Typical Use Cases
Text generation
Chatbots
Coding assistance
Mathematical problem-solving
Language translation
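For any of the use cases above, requests must fit within the context length listed in the specifications. A minimal sketch of a pre-flight check, assuming the listed ~4K context corresponds to 4,096 tokens and using a rough ~4-characters-per-token heuristic (both are assumptions; a real tokenizer gives exact counts):

```python
# Check whether a prompt plus an expected reply fits the model's context window.
# Assumptions (not from the spec sheet): 4,096-token context, ~4 chars/token
# heuristic for English text; actual tokenizer counts will differ.
CONTEXT_TOKENS = 4096

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, reply_budget: int = 512) -> bool:
    """True if the prompt plus a reserved reply budget fits the context."""
    return estimate_tokens(prompt) + reply_budget <= CONTEXT_TOKENS

print(fits_in_context("Translate this sentence into Chinese."))
```

Reserving a reply budget up front avoids mid-generation truncation when the prompt alone nearly fills the window.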
Model Information
Type
Open-weight; commercial use permitted
License
MIT License for code; DeepSeek License Agreement for models