BEINECKE MS 408

37,465 tokens · 7,598 types · 226 folios

Corpus Overview

Full corpus glyph rendering under 4-slot grammar (P·G·C·S).

Top 30 Tokens

Prefix Distribution

Suffix Family Distribution

Section Sizes

Folio Browser



Token Frequency






EVA → Glyph Alphabet

VoynichEVA TrueType font. Install, type EVA, change font — glyphs appear.

Basic Characters

Common Composites

Live Preview

Live Statistics

Compute corpus-level and positional metrics on any scope.

Compare VMS Scopes

Side-by-side statistical comparison of any two folios or sections.



Upload & Compare

Upload a .txt or .docx file, tokenise it, then run the full stats engine and compare against any VMS scope.
Drop a .txt or .docx file here, or click to browse
Text will be processed entirely in your browser — nothing leaves your device.

Tokenisation Options

Cross-Transliteration Comparison

Run the 80-metric engine on multiple transliterations for the same scope and compare where statistical profiles diverge.

Zipf Plot

Log-log rank-frequency plot. A straight line indicates a power-law (Zipfian) distribution.
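As a rough illustration of what a plot like this measures, the sketch below ranks token counts and fits a straight line in log-log space by ordinary least squares. The function name `zipf_fit` and the input format (a flat list of token strings) are assumptions made for this example, not the tool's actual code.

```python
import math
from collections import Counter


def zipf_fit(tokens):
    """Return (ranked_counts, slope) from a least-squares fit in log-log space.

    Hypothetical helper: assumes `tokens` is a flat list of token strings.
    """
    ranked = sorted(Counter(tokens).values(), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two distinct types to fit a line")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for count in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return ranked, sxy / sxx
```

A slope close to -1 corresponds to the classic Zipfian case, where frequency is roughly inversely proportional to rank.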


Word Length Distribution

Token count by character length. Voynichese has a distinctively tight word-length distribution.
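One way to put a number on how tight the distribution is would be to report the mean and standard deviation next to the histogram. The sketch below assumes that length is counted in EVA characters, so the figures depend on the transliteration used. The name `length_distribution` is hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def length_distribution(tokens):
    """Return (histogram, mean_length, population_std_dev) for token lengths.

    Assumption: length is the number of EVA characters in each token.
    """
    lengths = [len(tok) for tok in tokens]
    return Counter(lengths), mean(lengths), pstdev(lengths)
```

A smaller standard deviation means the word lengths cluster more closely around the mean.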


N-Gram Explorer

Browse character n-grams with frequency and positional distribution (start/mid/end of word).
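The positional breakdown could be computed along the lines below. This is a minimal sketch, assuming that an n-gram spanning a whole word counts as both a start and an end occurrence. The function `ngram_positions` is a hypothetical name, not part of the tool.

```python
from collections import defaultdict


def ngram_positions(tokens, n=2):
    """Count each character n-gram by position within its word.

    Assumption: an n-gram covering the whole word counts as start AND end.
    Words shorter than n contribute nothing.
    """
    table = defaultdict(lambda: {"start": 0, "mid": 0, "end": 0})
    for tok in tokens:
        last = len(tok) - n  # index of the final n-gram in this word
        for i in range(last + 1):
            gram = tok[i:i + n]
            if i == 0:
                table[gram]["start"] += 1
            if i == last:
                table[gram]["end"] += 1
            if 0 < i < last:
                table[gram]["mid"] += 1
    return dict(table)
```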



N-Gram Progression Heatmap

Frequency of n-grams across folios in reading order. Each row is an n-gram, each column a folio. Filter to a single n-gram to see its distribution.





Vocabulary Growth Curve

Cumulative unique types vs tokens in folio reading order. Overlay shows Heaps' law fit.
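Heaps' law models vocabulary size as V = K · N^β, where N is the running token count. The sketch below builds the growth curve and estimates K and β by least squares on the log-transformed points. It assumes the tokens are already in folio reading order. Both function names are hypothetical.

```python
import math


def growth_curve(tokens):
    """Return [(tokens_seen, types_seen), ...] for tokens in reading order."""
    seen = set()
    curve = []
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        curve.append((i, len(seen)))
    return curve


def fit_heaps(curve):
    """Fit V = K * N**beta by least squares in log-log space; return (K, beta)."""
    if len(curve) < 2:
        raise ValueError("need at least two points to fit")
    xs = [math.log(n) for n, _ in curve]
    ys = [math.log(v) for _, v in curve]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    return math.exp(mean_y - beta * mean_x), beta
```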

KWIC Concordance

Keyword in Context — see each match with surrounding tokens, aligned on the search term.
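A concordance of this kind can be sketched in a few lines. The example assumes exact-match search over a flat token list and a fixed context width. The names `kwic` and `format_kwic` are illustrative only.

```python
def kwic(tokens, keyword, width=3):
    """Return (left_context, match, right_context) for each exact match."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            rows.append((left, tok, right))
    return rows


def format_kwic(rows):
    """Right-align the left context so the keyword column lines up."""
    lefts = [" ".join(left) for left, _, _ in rows]
    pad = max((len(s) for s in lefts), default=0)
    return [
        f"{left_str:>{pad}}  {match}  {' '.join(right)}"
        for left_str, (_, match, right) in zip(lefts, rows)
    ]
```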



Co-Occurrence Matrix

Character co-occurrence heatmap — reveals phonotactic constraints within Voynichese words.
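One plausible reading of "co-occurrence" here is adjacency: how often character a is immediately followed by character b inside a word. The sketch below counts those pairs. It treats each EVA character as one unit, which is an assumption, since composite glyphs would need a glyph-level tokeniser. `adjacency_counts` is a hypothetical name.

```python
from collections import Counter


def adjacency_counts(tokens):
    """Count ordered adjacent character pairs within each word.

    Assumption: each EVA character is one unit; composites are not merged.
    """
    pairs = Counter()
    for tok in tokens:
        pairs.update(zip(tok, tok[1:]))
    return pairs
```

Pairs that never occur, or occur far less often than the individual character frequencies would suggest, are the candidates for positional or sequencing constraints.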


Word Trace

Trace a specific word or pattern across the entire manuscript. See where it appears, how it clusters by section, and which folios have the highest concentration.


Transliteration Agreement

Compare where transliterations agree or disagree on each line. Tokens are highlighted by agreement level.
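A simple agreement score per token is the share of transliterations that give the most common reading at that position. The sketch below assumes the transliterations of a line are already aligned token by token, and a missing token counts as its own reading (`None`). The function name is illustrative.

```python
from collections import Counter
from itertools import zip_longest


def agreement_levels(versions):
    """Return [(majority_reading, agreement_fraction), ...] for one line.

    versions: list of token lists, one per transliteration (assumed aligned
    by position; shorter lists are padded with None).
    """
    levels = []
    for readings in zip_longest(*versions, fillvalue=None):
        top, votes = Counter(readings).most_common(1)[0]
        levels.append((top, votes / len(readings)))
    return levels
```

Positional alignment is the weakest assumption here: where transliterators split or merge words differently, a sequence alignment step would be needed before scoring.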



Stroke-Level AI Analysis

Top-disagreement tokens for this folio. Click Analyze to crop the glyph from the Yale IIIF image and ask Gemini to evaluate the strokes.