BERT Tokenizer Calculator

Estimate token counts for BERT and similar models.

Text Input

Note: Token counts are estimates. Actual tokenization depends on the specific vocabulary and text content.
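To illustrate why counts depend on the vocabulary, here is a minimal sketch of greedy longest-match-first subword splitting in the style of WordPiece, which BERT uses. The vocabulary `TOY_VOCAB` and the helpers `wordpiece` and `encode` are hypothetical, made up for this example. The sketch also skips the punctuation splitting and normalization a real BERT tokenizer performs, so treat it as an approximation rather than the calculator's method.

```python
# Hypothetical toy vocabulary; a real BERT vocabulary has tens of thousands of entries.
TOY_VOCAB = {"the", "token", "##izer", "##s", "play", "##ing"}


def wordpiece(word, vocab, unk="[UNK]"):
    """Split one word into subword pieces using greedy longest-match-first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole word becomes unknown
        pieces.append(match)
        start = end
    return pieces


def encode(text, vocab):
    """Lowercase, split on whitespace, and wrap the pieces in [CLS] ... [SEP]."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    tokens.append("[SEP]")
    return tokens


tokens = encode("The tokenizers playing", TOY_VOCAB)
print(tokens, len(tokens))
```

With this toy vocabulary, three words become eight tokens: two special tokens plus six subword pieces. A different vocabulary would give a different count for the same text.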

Estimated Tokens

14 of 512 max

Words: 9
Tokens/Word: 1.56

Text Statistics

Character Count: 44
Characters (no spaces): 36
Avg. Chars/Word: 4.0
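The figures above are consistent with dividing the non-space character count by the word count (36 / 9 = 4.0). The sketch below shows one way to compute such statistics. The function name `text_stats` and the whitespace-based definition of a word are assumptions, since the page does not say how the calculator counts words.

```python
def text_stats(text):
    """Compute simple character and word statistics for a string."""
    words = text.split()  # assumption: words are whitespace-separated runs
    chars = len(text)
    chars_no_spaces = sum(1 for c in text if not c.isspace())
    avg_chars_per_word = chars_no_spaces / len(words) if words else 0.0
    return {
        "characters": chars,
        "characters_no_spaces": chars_no_spaces,
        "words": len(words),
        "avg_chars_per_word": avg_chars_per_word,
    }


# Hypothetical sample input, not the text used on the page.
stats = text_stats("the quick brown fox")
print(stats)
```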

Token Details

Special Tokens: 2
Subword Splits: ~3
Padding Tokens: 498
Vocabulary Size: 30,522
Embedding Memory: 42.00 KB
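The derived figures can be reproduced from the token and word counts with simple arithmetic. The sketch below is one plausible reconstruction, not the calculator's confirmed method. The hidden size of 768 and 4 bytes per value (float32) are assumptions based on a BERT-base configuration; they happen to reproduce the 42.00 KB shown. The function name `token_details` is hypothetical.

```python
def token_details(total_tokens, words, max_len=512, special_tokens=2,
                  hidden_size=768, bytes_per_value=4):
    """Derive token statistics from a token count and a word count.

    hidden_size and bytes_per_value are assumed (BERT-base, float32).
    """
    # Tokens beyond one per word, after removing the special tokens ([CLS], [SEP]).
    subword_splits = total_tokens - special_tokens - words
    # Positions left unfilled when the sequence is padded to max_len.
    padding_tokens = max(max_len - total_tokens, 0)
    tokens_per_word = total_tokens / words if words else 0.0
    # One embedding vector of hidden_size values per token, reported in KB.
    embedding_kb = total_tokens * hidden_size * bytes_per_value / 1024
    return {
        "subword_splits": subword_splits,
        "padding_tokens": padding_tokens,
        "tokens_per_word": tokens_per_word,
        "embedding_kb": embedding_kb,
    }


details = token_details(total_tokens=14, words=9)
print(details)
```

For 14 tokens and 9 words this gives 3 subword splits, 498 padding tokens, about 1.56 tokens per word, and 42.0 KB of embedding memory, matching the values listed above.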