Open-source models from DeepSeek, with the largest variant (V3) at 671B parameters, trained on English and Chinese text.
Technical Specifications
Parameters
671B
Context Length
4K (4,096 tokens)
Architecture
Pre-norm decoder-only Transformer with RMSNorm, SwiGLU, and RoPE; earlier dense variants use grouped-query attention (GQA), while V3 uses Multi-head Latent Attention (MLA) with a DeepSeekMoE mixture-of-experts design (671B total parameters, ~37B activated per token)
Score
33.9
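The parameter count above translates directly into a storage requirement for the weights. A minimal back-of-the-envelope sketch, assuming 2 bytes per parameter (FP16/BF16, an assumption not stated in the spec sheet):

```python
# Rough memory-footprint estimate from the parameter count above.
# Assumption (not from the spec sheet): 2 bytes per parameter (FP16/BF16);
# quantized deployments (e.g. 8-bit or 4-bit) would need proportionally less.
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

total_gb = weight_memory_gb(671e9)  # all 671B parameters
print(f"FP16 weights: ~{total_gb / 1000:.2f} TB")
```

This counts raw weight storage only; activations, KV cache, and optimizer state (for training) add further memory on top.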
Typical Use Cases
Text generation
Chatbots
Coding assistance
Mathematical problem-solving
Language translation
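For any of the use cases above, requests must fit within the context length listed in the specifications. A minimal sketch of a pre-flight check, assuming the listed ~4K context corresponds to 4,096 tokens and using a rough ~4-characters-per-token heuristic (both are assumptions; a real tokenizer gives exact counts):

```python
# Check whether a prompt plus an expected reply fits the model's context window.
# Assumptions (not from the spec sheet): 4,096-token context, ~4 chars/token
# heuristic for English text; actual tokenizer counts will differ.
CONTEXT_TOKENS = 4096

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, reply_budget: int = 512) -> bool:
    """True if the prompt plus a reserved reply budget fits the context."""
    return estimate_tokens(prompt) + reply_budget <= CONTEXT_TOKENS

print(fits_in_context("Translate this sentence into Chinese."))
```

Reserving a reply budget up front avoids mid-generation truncation when the prompt alone nearly fills the window.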
Model Information
Type
Open-weight; commercial use permitted
License
MIT License for code; DeepSeek License Agreement for models