A 70B-parameter model trained on substantially more data, in accordance with compute-optimal scaling laws, attaining superior accuracy for its compute budget (e.g., 67.5% on MMLU).
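For context, here is a minimal sketch of the compute-optimal rule of thumb this refers to: the scaling-law analysis suggests the number of training tokens should grow roughly in proportion to model size, approximately 20 tokens per parameter. The constant 20 and the helper `optimal_tokens` are illustrative assumptions, not values taken from this document.

```python
# Rough sketch of the compute-optimal scaling rule of thumb:
# training tokens D scale with parameter count N as D ~ 20 * N.
# The factor of 20 is an approximation, not a value stated in this document.

def optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate compute-optimal training-token count for a given model size."""
    return tokens_per_param * n_params

n = 70e9  # 70 billion parameters
print(f"~{optimal_tokens(n) / 1e12:.1f}T tokens")  # -> ~1.4T tokens
```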
Technical Specifications
Parameters
70 billion (70B)
Architecture
Autoregressive decoder-only transformer with 70 billion parameters, 80 layers, 64 attention heads, and a hidden dimension of 8192.
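As a sanity check on the figures above, a rough parameter count can be derived from the layer count and hidden dimension alone. The sketch below assumes a standard decoder-only block shape (feed-forward dimension of 4 × hidden dimension) and ignores layer norms, biases, and positional parameters; the vocabulary size is an assumption, since it is not given here.

```python
# Rough parameter-count estimate for the configuration above.
# Assumes a standard transformer block (d_ff = 4 * d_model); ignores
# layer norms, biases, and positional parameters. The vocabulary size
# is an assumed placeholder, not a value from this document.

def approx_params(n_layers: int, d_model: int, vocab_size: int = 32_000) -> int:
    attn = 4 * d_model * d_model       # Q, K, V, and output projections
    mlp = 2 * d_model * (4 * d_model)  # up- and down-projections, d_ff = 4 * d_model
    return n_layers * (attn + mlp) + vocab_size * d_model  # plus embedding table

print(f"~{approx_params(80, 8192) / 1e9:.1f}B parameters")  # -> ~64.7B
```

The estimate lands near, but below, the stated 70B, which is expected: the simplification omits per-layer norms, biases, and any positional or architecture-specific parameters.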