Mixture of Experts Calculator

Calculate MoE model parameters and efficiency.

MoE Configuration

Total Parameters: 47.5B (13.7B active per forward pass)

💾 Total Memory: 88.5 GB
🎯 Expert Utilization: 25%

Parameter Breakdown

Params per Expert (per layer): 176.2M
Params per Layer: 1.48B
Attention Params/Layer: 67.1M
Router Params/Layer: 32.8K
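
These per-layer figures can be reproduced from a standard MoE transformer layout. Below is a minimal sketch; the config values (hidden size 4096, FFN size 14336, SwiGLU experts with three projection matrices, full multi-head attention, 32 layers, 32K vocab) are assumptions chosen to match the readouts above, not values taken from the calculator.

```python
# Minimal sketch of the parameter breakdown. All config values are
# assumptions (Mixtral-8x7B-like) chosen to match the readouts above.
HIDDEN = 4096       # model dimension
FFN = 14336         # expert feed-forward dimension
N_EXPERTS = 8       # routed experts per layer
N_LAYERS = 32
VOCAB = 32000

# Each expert is a SwiGLU FFN: gate, up, and down projections.
params_per_expert = 3 * HIDDEN * FFN                  # ~176.2M

# Full multi-head attention: Q, K, V, O projections (no GQA assumed).
attn_per_layer = 4 * HIDDEN * HIDDEN                  # ~67.1M

# Router: one linear layer producing a score per expert.
router_per_layer = HIDDEN * N_EXPERTS                 # ~32.8K

params_per_layer = (N_EXPERTS * params_per_expert
                    + attn_per_layer + router_per_layer)   # ~1.48B

# Untied input and output embeddings round out the total.
total_params = N_LAYERS * params_per_layer + 2 * VOCAB * HIDDEN  # ~47.5B

print(f"per expert: {params_per_expert / 1e6:.1f}M")
print(f"per layer:  {params_per_layer / 1e9:.2f}B")
print(f"total:      {total_params / 1e9:.1f}B")
```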

Efficiency Metrics

Parameter Efficiency: 28.8%
Memory Overhead: 247.1%
Equivalent Dense Model: 13.7B
Equiv. Dense Memory: 25.5 GB
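
These metrics follow directly from the totals: parameter efficiency is active over total parameters, the equivalent dense model is simply the active count, and memory overhead compares the two footprints. A sketch, assuming 2 bytes per parameter (fp16/bf16) and binary gigabytes; both are assumptions that happen to match the readouts:

```python
# Sketch of the efficiency metrics, assuming 2 bytes/param (fp16/bf16)
# and GiB (1024**3 bytes). Both assumptions match the readouts above.
BYTES_PER_PARAM = 2
GIB = 1024 ** 3

total_params = 47.5e9
active_params = 13.7e9   # params touched per forward pass

param_efficiency = active_params / total_params          # ~28.8%
total_mem_gib = total_params * BYTES_PER_PARAM / GIB     # ~88.5
dense_mem_gib = active_params * BYTES_PER_PARAM / GIB    # ~25.5 (equivalent dense)

# Extra memory carried relative to a dense model of the same active size;
# ~247% (the UI's 247.1% comes from the rounded totals).
mem_overhead = (total_mem_gib - dense_mem_gib) / dense_mem_gib

print(f"efficiency {param_efficiency:.1%}, overhead {mem_overhead:.1%}")
```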

Expert Configuration

Routed Experts: 8
Shared Experts: 0
Tokens per Expert: 16%
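
The 25% utilization readout above is just the routing fan-out: the fraction of routed experts that fire for any single token. A sketch, assuming top-2 routing; the value of k is an inference from the 25% readout and the active-parameter count, not something the calculator displays:

```python
# Sketch of expert utilization, assuming top-k = 2 routing.
# k is an inferred assumption; it is not shown by the calculator.
n_routed_experts = 8
n_shared_experts = 0
top_k = 2

# Fraction of routed experts active for any single token.
utilization = top_k / n_routed_experts
print(f"expert utilization: {utilization:.0%}")  # 25%
```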

Tip: MoE models like Mixtral 8x7B have ~47B total parameters but only ~13B active per token, giving dense-model quality at a much lower compute cost.