Mixture of experts

Also called: MoE

In short

A model architecture that activates only a relevant subset of its parameters for each input, giving large-model capability at lower running cost.

A mixture-of-experts model is split into many sub-networks, or experts, and a router picks a few of them to handle each input (in most language models, each token). So a model with a huge total parameter count runs only a fraction of itself per request.
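As a rough illustration of the routing step, here is a minimal sketch in plain Python. It assumes top-k routing with the softmax taken over the selected experts only, which is one common variant rather than the only one. The names `route_top_k`, `moe_forward`, `experts`, and `router` are hypothetical, and the experts are toy scalar functions standing in for real sub-networks.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    # Pick the k experts with the highest router scores and
    # normalise their scores into mixing weights that sum to 1.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router, k=2):
    # Run only the chosen experts and blend their outputs.
    chosen = route_top_k(router(x), k)
    out = sum(w * experts[i](x) for i, w in chosen)
    return out, [i for i, _ in chosen]

# Toy setup (assumption): four "experts" that just scale the input,
# and a router that returns fixed scores regardless of the input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
router = lambda x: [0.1, 2.0, -1.0, 1.5]

output, used = moe_forward(1.0, experts, router, k=2)
print(used)  # indices of the experts that actually ran
```

Only the two selected experts are evaluated; the other two are never called, which is where the saving comes from.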

The result is large-model capability without large-model compute on every call, although every expert still has to be held in memory. Several strong open-weights models, such as Mixtral 8x7B and DeepSeek-V3, use this design.
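To put numbers on the saving, the sketch below compares total parameters with the parameters used per token. All figures are made up for illustration and do not describe any real model. The helper `parameter_counts` is hypothetical, and it assumes every expert is the same size and that some parameters (attention, embeddings) are shared and always active.

```python
def parameter_counts(shared, per_expert, num_experts, top_k):
    # Total: everything that must be stored.
    total = shared + per_expert * num_experts
    # Active: shared layers plus only the experts the router selects.
    active = shared + per_expert * top_k
    return total, active

# Hypothetical model: 2B shared parameters, 16 experts of 1B each,
# 2 experts chosen per token.
total, active = parameter_counts(
    shared=2_000_000_000,
    per_expert=1_000_000_000,
    num_experts=16,
    top_k=2,
)
print(f"total:  {total / 1e9:.0f}B")
print(f"active: {active / 1e9:.0f}B ({active / total:.0%} of total)")
```

In this made-up case the model stores 18B parameters but uses 4B per token, so compute per call is closer to that of a 4B dense model while memory requirements follow the 18B total.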

In LLMWeave

Some of the models available in LLMWeave use a mixture-of-experts design, which is part of why capable models can be offered at a low or even free tier.

Related terms

Open-weights model

Reasoning model

Context window
