A 123B-parameter multimodal model from Mistral AI (2024) combining text and vision capabilities:contentReference[oaicite:42]{index=42}.