A suite of foundation models (7B–65B parameters) trained on up to 1.4T tokens, with the 65B version outperforming GPT-3 on many benchmarks.
Technical Specifications
Parameters
65B
Context Length
2,048 tokens (2K)
Architecture
Autoregressive decoder-only Transformer with SwiGLU activations, rotary positional embeddings (RoPE), and RMSNorm
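As a rough illustration of these components, the following is a minimal PyTorch sketch of RMSNorm, a SwiGLU feed-forward layer, and RoPE. The class names, dimension parameters, and the half-rotation RoPE variant shown here are illustrative assumptions, not the original implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    """Root-mean-square layer norm: rescales by the RMS of the activations (no mean-centering)."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight


class SwiGLUFeedForward(nn.Module):
    """Gated feed-forward block: SiLU(x W1) * (x W3), projected back with W2."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # value projection
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(F.silu(self.w1(x)) * self.w3(x))


def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate channel pairs by a position-dependent angle (RoPE), half-rotation form."""
    seq_len, dim = x.shape[-2], x.shape[-1]
    half = dim // 2
    freqs = base ** (-torch.arange(0, half, dtype=torch.float32) / half)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```

In the full model these pieces sit inside each decoder layer: RMSNorm is applied before the attention and feed-forward sublayers (pre-normalization), RoPE is applied to the query and key projections inside attention, and the SwiGLU block replaces the standard ReLU feed-forward network.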