How to Cite Archived Pages Correctly

April 16, 2026

An archived page is only useful if you can reference it clearly and defensibly. A bad citation sends readers to a 404, leaves verifiers guessing about which version you meant, or quietly substitutes the live page for the snapshot. Getting citations right is less about formatting and more about capturing the right metadata at the moment you find a source.

There are five fields worth recording every time you cite an archived page: the original URL, the archive's snapshot URL, the snapshot date (the date the page was captured, not the date you accessed the archive), the page title, and the publisher or author when known. If you collect these consistently, fitting them into any citation style (MLA, APA, Chicago, AP) becomes mechanical.
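One way to keep those five fields together is a small record type. The sketch below is only an illustration: the class name `ArchivedSource` and its field names are hypothetical, and the example values reuse the placeholder `example.com` address.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ArchivedSource:
    """The five fields worth recording for an archived page (hypothetical names)."""
    original_url: str             # the page's address on the live web
    snapshot_url: str             # the full archive URL, timestamp included
    snapshot_date: date           # when the archive captured the page
    title: str                    # page title as shown in the snapshot
    author: Optional[str] = None  # publisher or author, when known


# Placeholder values for illustration only.
record = ArchivedSource(
    original_url="https://example.com/page",
    snapshot_url="https://web.archive.org/web/20230415120000/https://example.com/page",
    snapshot_date=date(2023, 4, 15),
    title="Example Page",
)
```

Making the record immutable (`frozen=True`) is a design choice, not a requirement: it keeps a captured citation from being edited by accident after the fact.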

The snapshot URL is the part most people get wrong. A Wayback Machine URL contains the capture timestamp directly, as in https://web.archive.org/web/20230415120000/https://example.com/page, where the fourteen-digit string is the capture time in YYYYMMDDHHMMSS format, recorded in UTC. Cite the full URL, not a shortened or canonicalized version, and never substitute the live page URL just because it still resolves. The whole point of an archive citation is that you are pointing to a specific version that may no longer exist on the live web.
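Because the timestamp sits inside the URL, the snapshot date can be read from the link rather than typed by hand. The following is a minimal sketch, not an official parser: the function name `parse_wayback_url` is hypothetical, it handles only the plain URL layout shown above, and it assumes the timestamp is in UTC.

```python
import re
from datetime import datetime, timezone

# Matches only the plain layout: /web/<14 digits>/<original URL>.
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_wayback_url(snapshot_url):
    """Return (capture_time, original_url) or raise ValueError."""
    match = WAYBACK_RE.match(snapshot_url)
    if match is None:
        raise ValueError(f"not a timestamped Wayback URL: {snapshot_url!r}")
    stamp, original_url = match.groups()
    # Assumption: the fourteen digits are YYYYMMDDHHMMSS in UTC.
    capture_time = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(
        tzinfo=timezone.utc
    )
    return capture_time, original_url


capture_time, original_url = parse_wayback_url(
    "https://web.archive.org/web/20230415120000/https://example.com/page"
)
print(capture_time.date().isoformat(), original_url)
```

A URL without a fourteen-digit timestamp, such as a calendar view, raises `ValueError` instead of yielding a guessed date.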

For Wayback Machine citations, most major styles now accept a format along the lines of: *Author or organization. "Page Title." Original URL. Archived [snapshot date] at [snapshot URL].* The order varies by style guide; the substance does not. APA wants the snapshot date as the version date, Chicago treats it as part of the access information, and journalism style sheets typically prefer inline parenthetical attribution.
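Once the fields are recorded, the generic pattern above can be assembled mechanically. This sketch follows only that generic pattern, not any particular style guide; the function name `format_archive_citation` and the ISO date format are assumptions made for illustration.

```python
from datetime import date


def format_archive_citation(author, title, original_url, snapshot_date, snapshot_url):
    """Build: Author. "Title." Original URL. Archived DATE at SNAPSHOT URL."""
    parts = []
    if author:  # publisher or author is optional
        parts.append(f"{author}.")
    parts.append(f'"{title}."')
    parts.append(f"{original_url}.")
    # ISO 8601 dates avoid locale-dependent month names; a real style
    # guide may want a different date format.
    parts.append(f"Archived {snapshot_date.isoformat()} at {snapshot_url}.")
    return " ".join(parts)


citation = format_archive_citation(
    author="Example Org",
    title="Example Page",
    original_url="https://example.com/page",
    snapshot_date=date(2023, 4, 15),
    snapshot_url="https://web.archive.org/web/20230415120000/https://example.com/page",
)
print(citation)
```

Reordering the `parts` list is all it takes to adapt the output to a style that places the date or the snapshot URL elsewhere.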

Common mistakes are worth flagging because each degrades the citation in a different way. Linking to the live page when you mean the snapshot lets the source change underneath you. Omitting the snapshot date makes it impossible to verify which version you read. Citing only the archive root (web.archive.org/web/*/example.com) without a specific timestamp forces the reader to guess. And using shortened links (bit.ly, t.co, anything proxied) introduces a second point of failure that may rot before the archive itself does.
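These mistakes are regular enough to check automatically before a citation is published. The sketch below is one possible lint pass, under stated assumptions: the function name `citation_problems` is hypothetical, the shortener list contains only the two hosts named above, and the timestamp check applies only to Wayback Machine URLs.

```python
import re
from urllib.parse import urlparse

# Illustrative list taken from the text; extend it for real use.
SHORTENER_HOSTS = {"bit.ly", "t.co"}
TIMESTAMPED = re.compile(r"^https?://web\.archive\.org/web/\d{14}/")


def citation_problems(original_url, snapshot_url, snapshot_date):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if snapshot_url == original_url:
        problems.append("snapshot URL is the live page")
    host = (urlparse(snapshot_url).hostname or "").lower()
    if host in SHORTENER_HOSTS:
        problems.append("snapshot URL is a shortened link")
    # Covers both the archive root (/web/*/...) and any other untimestamped form.
    if host == "web.archive.org" and not TIMESTAMPED.match(snapshot_url):
        problems.append("Wayback URL lacks a specific 14-digit timestamp")
    if not snapshot_date:
        problems.append("snapshot date is missing")
    return problems


print(citation_problems(
    "https://example.com/page",
    "https://web.archive.org/web/*/example.com",
    "2023-04-15",
))
```

The function reports every problem it finds rather than stopping at the first, so a single pass shows everything that needs fixing.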

Arkibber makes the metadata-capture step lighter, which is where citations usually break down. Pulling the original URL, snapshot URL, capture date, and title together at the point of discovery means assembling final citations is a matter of formatting, not archaeology.

For high-stakes work (legal filings, investigative journalism, academic publication), capture the page in two archives if you can. Submit it to the Wayback Machine and to Archive.today, and cite both. Archives can fail or change their interfaces, and depending on any single service is a risk you do not need to take. Belt and suspenders is the right posture for citations that need to outlive the platforms that host them.
