Why Vestige exists
There is an unreasonable amount of text left behind by the people who shaped the world. Diaries kept nightly, letters sent weekly, sermons and speeches and marginalia and notebooks of sums and daydreams. Most of it sits unread.
We built Vestige because reading an archive cold is hard, and because the form of a question is often how a new reader finds their way in. If you want to know what Lincoln thought about grief, or what Marcus Aurelius thought about the opinion of strangers, the answers are there — they just aren't indexed in a form that makes them easy to meet.
Vestige is a conversational interface to those archives. You ask, it retrieves, it renders in the voice of the original. The goal isn't to entertain, though it often does; the goal is to make a centuries-old record newly approachable, and to put the reader back in front of the primary text.
How responses are constructed
Every conversation on Vestige runs the same pipeline:
- Retrieval. The question is matched against a semantic index of the figure's surviving writings. Relevant passages — usually three to eight — are pulled into context.
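- Temporal lock. Whatever date is set on the timeline, only writings up to that point are eligible. Lincoln on 1 April 1861 cannot cite his own Gettysburg Address, which he did not deliver until 19 November 1863. The date filter runs before retrieval rather than after it, so later writings are never candidates in the first place.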
- Persona grounding. A system prompt specific to the figure — written from their documented values, era, and idiom — tells the model to speak in their voice and to refuse anachronism.
- Generation. Claude (by Anthropic) writes the response using the retrieved passages as evidence. Sources are surfaced with each reply so you can read the original.
- Voice. Optional text-to-speech uses a voice profile tuned to period and accent — American plains for Lincoln, Received Pronunciation for Austen, and so on.
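The first two steps above can be illustrated with a minimal sketch. It is not Vestige's actual code: the `Passage` shape, the `retrieve` function, the toy two-dimensional embeddings, and the zero-padded ISO date strings are all assumptions made for illustration. What it shows is the ordering, with the date filter applied before any similarity ranking.

```typescript
// Hypothetical passage shape; none of these names come from Vestige itself.
interface Passage {
  id: string;
  text: string;
  writtenOn: string; // zero-padded ISO date (YYYY-MM-DD), so string order matches date order
  embedding: number[]; // toy vectors here; a real index would use sentence embeddings
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Temporal lock first, then similarity ranking over what remains.
function retrieve(
  corpus: Passage[],
  queryEmbedding: number[],
  timelineDate: string,
  k: number = 8
): Passage[] {
  const eligible = corpus.filter((p) => p.writtenOn <= timelineDate);
  return eligible
    .map((p) => ({ passage: p, score: cosine(p.embedding, queryEmbedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.passage);
}

const corpus: Passage[] = [
  { id: "first-inaugural", text: "(text omitted)", writtenOn: "1861-03-04", embedding: [0.9, 0.1] },
  { id: "gettysburg-address", text: "(text omitted)", writtenOn: "1863-11-19", embedding: [1.0, 0.0] },
];

// With the timeline set to 1 April 1861, the Gettysburg Address is not a
// candidate, even though its toy embedding is the closer match to the query.
const hits = retrieve(corpus, [1, 0], "1861-04-01");
console.log(hits.map((p) => p.id));
```

Filtering before ranking means a later text can never crowd an earlier one out of the top results, which is one plausible reason to enforce the lock at this stage.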
When a figure's archive doesn't cover a topic, the correct behavior is to say so rather than improvise. That boundary is the most important one in the system.
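One way to picture that boundary is a gate between retrieval and generation. The sketch below is an assumption throughout: the `ScoredPassage` and `Grounding` types, the `ground` function, and especially the `0.35` similarity cutoff are hypothetical, and the real system may decide coverage quite differently.

```typescript
// Hypothetical types; the cutoff below is an illustrative guess, not a documented value.
interface ScoredPassage {
  id: string;
  text: string;
  score: number; // similarity between the question and the passage, 0 to 1
}

type Grounding =
  | { kind: "covered"; passages: ScoredPassage[] }
  | { kind: "not_covered"; message: string };

const MIN_SCORE = 0.35; // assumed threshold below which a passage is not treated as evidence

// Decide whether the archive supports an answer before any text is generated.
function ground(
  candidates: ScoredPassage[],
  figure: string,
  minScore: number = MIN_SCORE
): Grounding {
  const supported = candidates.filter((c) => c.score >= minScore);
  if (supported.length === 0) {
    // No evidence: say so rather than hand the model an empty context.
    return {
      kind: "not_covered",
      message: `${figure}'s surviving writings do not address this.`,
    };
  }
  return { kind: "covered", passages: supported };
}

const weak = ground([{ id: "letter-12", text: "(text omitted)", score: 0.2 }], "Austen");
const strong = ground([{ id: "letter-40", text: "(text omitted)", score: 0.8 }], "Austen");
console.log(weak.kind, strong.kind);
```

Returning a distinct `not_covered` result, instead of an empty passage list, forces the caller to handle the no-answer case explicitly.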
Corpus sources
Every text used to ground a subject's responses is either in the public domain or released under a permissive license. The main wells we draw from:
- Project Gutenberg — the backbone. Tens of thousands of English-language public-domain texts, many of them the definitive editions of the figures we cover.
- Wikisource — correspondence, speeches, public-domain pamphlets, and official documents often not in Gutenberg.
- Internet Archive — out-of-copyright scans, in particular for figures whose collected works predate modern digital editions.
- Founders Online (National Archives) — for American founding-era figures, with full provenance metadata.
- University correspondence archives (Darwin Correspondence Project, etc.) for out-of-copyright letters.
- Wikimedia Commons — public-domain portraiture.
Each subject's specific source manifest lives on their page and is available on request. We double-check copyright status on ingestion; if you believe a source has been used incorrectly, please write to us.
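As a rough illustration of what a manifest and an ingestion check could look like, here is a small sketch. The `ManifestEntry` fields, the licence labels, and the allow-list are all hypothetical; the actual manifest format is not described here.

```typescript
// Hypothetical licence labels and manifest shape, for illustration only.
type License = "public-domain" | "cc0" | "cc-by" | "unknown";

interface ManifestEntry {
  title: string;
  archive: string; // where the text was obtained
  license: License;
}

// Assumed allow-list: public-domain works plus a couple of permissive licences.
const ALLOWED: Record<License, boolean> = {
  "public-domain": true,
  "cc0": true,
  "cc-by": true,
  "unknown": false,
};

// Split a manifest into entries that may be ingested and entries that need review.
function partitionByLicense(entries: ManifestEntry[]) {
  return {
    accepted: entries.filter((e) => ALLOWED[e.license]),
    rejected: entries.filter((e) => !ALLOWED[e.license]),
  };
}

const manifest: ManifestEntry[] = [
  { title: "Pride and Prejudice", archive: "Project Gutenberg", license: "public-domain" },
  { title: "Hypothetical modern annotated edition", archive: "example", license: "unknown" },
];

const checked = partitionByLicense(manifest);
console.log(checked.accepted.map((e) => e.title));
console.log(checked.rejected.map((e) => e.title));
```

Anything with an unclear status lands in `rejected` by default, so a text is only ingested once its licence has been positively identified.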
What Vestige is not
Not a seance
We're not contacting the dead. A figure's "voice" is a model reading their surviving text. When no text supports an answer, there is no answer.
Not ghost-writing
Vestige replies should not be quoted as if they were the figure's own words. They paraphrase. The underlying sources are the quotable thing — and we link to them.
Not a replacement for scholarship
A historian reading Austen's letters in sequence will see things Vestige won't surface. Use this as a gateway, not a substitute.
Credits
Model providers
- Anthropic (Claude), used for response generation
Archives & source texts
- Project Gutenberg
- Wikimedia Commons — portraits
- Internet Archive / archive.org
- Wikisource
- Founders Online and numerous university correspondence archives
Open-source tools
- Node.js, Express, and the surrounding npm ecosystem.
- Sentence embeddings and vector indices for retrieval.
- MinIO for storage, Coolify for deployment.
Contact
For press, corrections, partnerships, or to tell us an archive we missed:
[email protected]