Resilient Citations

A fragile citation is a citation with a mutable source, that is, source text that can be modified or deleted after the citation is published. A reference to a conventional web URL (such as in the traditional “Retrieved at” format) is a fragile citation, since the webpage may be changed, the server may go down, the domain name may expire, etc.

A resilient citation, in contrast, is a citation with an immutable source, or permanent link to its source text.

Great strides are being made towards a permanent web with technologies such as IPFS, Arweave, and Swarm. Such a web would effectively make all published resources resilient. These are early technologies, however. The timeline for their development and scaling is unknown, and they may be supplanted by other technologies before reaching maturity. In the meantime, what is a reasonable solution to the fragile citation problem?

Instead of attempting to solve the problem of fragile citations globally, for all documents, we can narrow our requirement to a single document and its citations. The problem then becomes: how can a citation preserve its source text for as long as the citing document exists? This is more tractable, as the permanence of the source is conditional on the citing document and limited to the sources actually cited.

This may be achieved in two steps:

  1. Archive the source text with the citing document. This provides the conditional permanence that depends only on the preservation of the citing document.
  2. Prove the authenticity of the source text using TLSNotary. This effectively proves that the source was retrieved without tampering.
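As a rough illustration of step 1, here is a minimal Python sketch of a citation record that carries its source text along with the citing document. The names `ArchivedCitation`, `archive_citation`, and `is_intact` are hypothetical, and the `notary_proof` field is only an opaque placeholder: producing and verifying a real TLSNotary proof is outside this sketch. The SHA-256 digest only detects later changes to the archived copy; it does not by itself prove where the text came from, which is the job of step 2.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ArchivedCitation:
    """A citation that stores its source text alongside the citing document."""
    source_url: str                      # where the text was retrieved from
    retrieved_at: str                    # ISO 8601 timestamp of retrieval
    source_text: str                     # the archived source text itself
    sha256: str                          # digest of source_text at archive time
    notary_proof: Optional[str] = None   # placeholder for a TLSNotary proof (opaque here)


def archive_citation(source_url, retrieved_at, source_text, notary_proof=None):
    """Bundle the source text with a digest taken at archive time."""
    digest = hashlib.sha256(source_text.encode("utf-8")).hexdigest()
    return ArchivedCitation(source_url, retrieved_at, source_text, digest, notary_proof)


def is_intact(citation):
    """Check that the archived text still matches the stored digest."""
    current = hashlib.sha256(citation.source_text.encode("utf-8")).hexdigest()
    return current == citation.sha256


def to_json(citation):
    """Serialize the record so it can be published with the citing document."""
    return json.dumps(asdict(citation), ensure_ascii=False, indent=2)
```

A record built this way can be embedded in, or shipped next to, the citing document, so the cited text survives for as long as that document does.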

I suggest that the HTTP Range header would be a useful way to delimit the specific span of source text cited and to reduce bandwidth requirements. Unfortunately, it is neither in widespread use nor enabled on platforms such as WordPress. In lieu of Range support at the source, a proxy server could retrieve the full resource and then re-serve it with Range support, with the appropriate notarization to prove its authenticity.
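To make the Range idea concrete, here is a small Python sketch under a few assumptions. `range_request` builds (but does not send) a request for a byte span, and `serve_range` shows roughly what a proxy might do when re-serving a full resource with Range support. Both function names are hypothetical, only the simple single-range form `bytes=start-end` is handled, and notarization is not shown. Note that Range offsets count bytes, not characters, so a cited span would need to be expressed as byte offsets into the resource.

```python
import re
from urllib.request import Request

# Matches only the simple single-range form "bytes=start-end".
_RANGE = re.compile(r"^bytes=(\d+)-(\d+)$")


def range_request(url, start, end):
    """Build (without sending) a GET request for bytes start..end inclusive."""
    if start < 0 or end < start:
        raise ValueError("invalid byte range")
    return Request(url, headers={"Range": f"bytes={start}-{end}"})


def serve_range(body, range_header):
    """Return (status, content_range, payload) for a full resource body.

    A sketch of proxy behaviour: 206 with the requested slice,
    416 if the range starts past the end, or 200 with the whole
    body when the header is not in the supported form.
    """
    match = _RANGE.match(range_header or "")
    if not match:
        return 200, None, body
    start, end = int(match.group(1)), int(match.group(2))
    if start >= len(body) or end < start:
        return 416, f"bytes */{len(body)}", b""
    end = min(end, len(body) - 1)  # clamp to the last available byte
    return 206, f"bytes {start}-{end}/{len(body)}", body[start:end + 1]
```

For example, with the body `b"The quick brown fox"`, the header `bytes=4-8` selects the five bytes `b"quick"`.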

By publishing source text with the citing document and including a TLSNotary proof of retrieval, resilient citations become possible using technologies readily available today. This forms the infrastructure for a deeply linked web of collective sensemaking, with or without the machine-readability mechanisms of the Semantic Web or Linked Data.
