Adi's Digital Garden

Provenance

Active

Irrefutable proof that writing was created by a human — through an authentic human process.

Try the demo →

The Vision

AI can now generate text that is indistinguishable in quality from human writing. Existing detectors are unreliable because they analyze the output, and the output isn't what makes human writing human.

The analogy I keep coming back to is live concerts. Studio recordings are technically perfect, but there's something irreplaceable about proof that a human actually performed it — in time, with friction. As AI floods the zone with perfect output, the process of creation becomes the differentiator.

Speed and single-burst generation are what make AI effective. Mistakes, creative detours, and rumination are what make us human. Provenance captures and proves those human characteristics — not by analyzing what you wrote, but by recording how you wrote it.

The Event Recorder

Provenance runs a CodeMirror 6 editor with a thin bridge layer (editorRecorder.js) that intercepts every document change event and pipes it into the core recorder. Nothing is sampled or batched at write-time — every atomic edit is captured.
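
As a rough illustration of what such a bridge might do, the sketch below turns one document change into recorder events. The function name changeToEvents, its parameter object, and the recorder.record call in the comment are hypothetical, not taken from editorRecorder.js. The CodeMirror 6 wiring is shown only as a comment so the block runs without the editor installed.

```javascript
// Hypothetical helper: convert one document change into recorder events.
// A replacement (typing over a selection) becomes a delete followed by an insert.
function changeToEvents({ position, deletedLength, inserted, isPaste, timestamp }) {
  const events = [];
  if (deletedLength > 0) {
    events.push({ type: "delete", timestamp, position, length: deletedLength });
  }
  if (inserted.length > 0) {
    events.push({
      type: isPaste ? "paste" : "insert",
      timestamp,
      position,
      content: inserted,
    });
  }
  return events;
}

// Possible CodeMirror 6 wiring (not executed here; `recorder.record` is an assumption):
//
//   EditorView.updateListener.of((update) => {
//     if (!update.docChanged) return;
//     const isPaste = update.transactions.some((tr) => tr.isUserEvent("input.paste"));
//     const timestamp = Date.now();
//     update.changes.iterChanges((fromA, toA, fromB, _toB, inserted) => {
//       // fromB is the change's start once earlier changes in the same update are applied.
//       const events = changeToEvents({
//         position: fromB,
//         deletedLength: toA - fromA,
//         inserted: inserted.toString(),
//         isPaste,
//         timestamp,
//       });
//       for (const event of events) recorder.record(event);
//     });
//   });
```

Keeping the conversion in a pure function makes it testable without a DOM; only the listener itself depends on the editor.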

Event Types Captured

  • insert — character(s) typed
  • delete — character(s) removed
  • paste — content pasted from clipboard
  • session_start — new writing session began
  • session_end — writing session ended

Per-Event Metadata

  • Millisecond-precision Unix timestamp
  • Cursor position in document
  • Content inserted or length deleted
  • SHA-256 hash (chained from prior event)

All recorder operations (startSession, recordInsert, endSession) are async and serialized through an internal operation queue. This prevents race conditions when events fire in rapid succession: hashing through the Web Crypto API resolves asynchronously, so a new event can arrive before the previous event's hash is ready. The queue ensures the hash chain is always computed in strict order.
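
One common way to get this kind of serialization is to chain every operation onto a single promise. The sketch below shows that pattern under the assumption that the recorder's queue works along similar lines; the class name OperationQueue and its enqueue method are illustrative, not the project's actual API.

```javascript
// Minimal sketch of a serializing queue: each operation starts only after
// the previous one has settled, so side effects happen in call order.
class OperationQueue {
  constructor() {
    this.tail = Promise.resolve();
  }

  enqueue(operation) {
    const result = this.tail.then(() => operation());
    // Swallow rejections on the internal chain so one failure does not block later operations.
    // The caller still sees the rejection through the returned promise.
    this.tail = result.catch(() => {});
    return result;
  }
}

// Usage: the slower first operation still finishes before the second one starts.
const queue = new OperationQueue();
const order = [];
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const done = Promise.all([
  queue.enqueue(async () => { await sleep(20); order.push("first"); }),
  queue.enqueue(async () => { order.push("second"); }),
]);
```

Without the queue, the second operation would finish first, and a hash computed from "the previous event" could pick up the wrong predecessor.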

How the Proof Is Generated

Each event's hash is computed over its own content plus the previous event's hash, forming a rolling SHA-256 chain:

event[0].hash = SHA256(event[0] + "")
event[1].hash = SHA256(event[1] + event[0].hash)
event[2].hash = SHA256(event[2] + event[1].hash)
...
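
The chain above can be sketched in a few lines of JavaScript. How an event is serialized before hashing is not specified here, so the example assumes JSON.stringify over the event's fields (excluding hash); serializeEvent, chainEvents, and verifyChain are hypothetical names. A real format would also need a canonical key order so that independent verifiers produce identical bytes.

```javascript
// Use Web Crypto in the browser; fall back to Node's built-in webcrypto where the global is missing.
const subtle = (globalThis.crypto ?? require("node:crypto").webcrypto).subtle;

async function sha256Hex(text) {
  const bytes = new TextEncoder().encode(text);
  const digest = await subtle.digest("SHA-256", bytes);
  return Array.from(new Uint8Array(digest), (b) => b.toString(16).padStart(2, "0")).join("");
}

// Assumption: an event is hashed as the JSON of its fields, without the "hash" field itself.
function serializeEvent({ hash, ...fields }) {
  return JSON.stringify(fields);
}

// Build the chain: each hash covers the event plus the previous event's hash.
async function chainEvents(events) {
  let previousHash = "";
  const chained = [];
  for (const event of events) {
    const hash = await sha256Hex(serializeEvent(event) + previousHash);
    chained.push({ ...event, hash });
    previousHash = hash;
  }
  return chained;
}

// Replay the chain; returns the index of the first broken link, or -1 if intact.
async function verifyChain(events) {
  let previousHash = "";
  for (let i = 0; i < events.length; i++) {
    const expected = await sha256Hex(serializeEvent(events[i]) + previousHash);
    if (expected !== events[i].hash) return i;
    previousHash = events[i].hash;
  }
  return -1;
}
```

Changing any field of an event makes verifyChain report that event's index, because the stored hash no longer matches the recomputed one.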

When you're done writing, everything gets serialized into a .provenance file — a portable, self-contained JSON artifact with all sessions, all events, the final content, and a SHA-256 hash of the final document. No server. No Provenance infrastructure. Anyone can verify independently by replaying the chain.

File Format (excerpt)

{
  "version": "1.0.0",
  "sessions": [{
    "id": "session-uuid",
    "startTime": "2024-01-15T09:00:00Z",
    "events": [
      { "type": "insert", "timestamp": 1705312800000,
        "position": 0, "content": "H", "hash": "abc123..." },
      { "type": "delete", "timestamp": 1705312805000,
        "position": 0, "length": 1,  "hash": "def456..." }
    ]
  }],
  "finalContent": "The complete document...",
  "contentHash": "sha256-of-final-content"
}
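
Given a file shaped like the excerpt above, a verifier could rebuild the document by applying the events in order and comparing the result with finalContent. The sketch below assumes that position is a string offset, that a delete removes length characters starting at position, and that paste events carry a content field like inserts; replayEvents is a hypothetical name.

```javascript
// Rebuild the document text by applying every recorded event in order.
function replayEvents(sessions) {
  let text = "";
  for (const session of sessions) {
    for (const event of session.events) {
      if (event.type === "insert" || event.type === "paste") {
        text = text.slice(0, event.position) + event.content + text.slice(event.position);
      } else if (event.type === "delete") {
        text = text.slice(0, event.position) + text.slice(event.position + event.length);
      }
      // session_start and session_end do not change the text.
    }
  }
  return text;
}

// Usage with a tiny hand-written file object (hashes omitted for brevity).
const file = {
  sessions: [{
    id: "session-uuid",
    events: [
      { type: "insert", timestamp: 1705312800000, position: 0, content: "Helo" },
      { type: "insert", timestamp: 1705312801000, position: 3, content: "l" },
      { type: "delete", timestamp: 1705312805000, position: 0, length: 1 },
    ],
  }],
  finalContent: "ello",
};
const matches = replayEvents(file.sessions) === file.finalContent;
```

A full check would also hash the replayed text and compare it with contentHash.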

Why Forgery Is Hard

Rolling Hash Chain

Modifying a single event invalidates every hash that follows it, so an edit cannot go unnoticed unless every later hash is recomputed as well. Verification requires no central authority; the math is self-contained in the file.

Behavioral Capture

Timing patterns, pause durations, and correction sequences are recorded at millisecond precision. Human typing has statistically distinctive bursts, hesitations, and backtrack patterns that are difficult to fabricate convincingly.
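
To make the timing signal concrete, the sketch below derives inter-event gaps from the recorded timestamps and summarizes them. It is only an illustration: the function name pauseStats and the 2,000 ms pause threshold are assumptions, and no particular value is claimed to separate human from scripted input.

```javascript
// Summarize the gaps between consecutive event timestamps (in milliseconds).
function pauseStats(timestamps, pauseThresholdMs = 2000) {
  const gaps = [];
  for (let i = 1; i < timestamps.length; i++) {
    gaps.push(timestamps[i] - timestamps[i - 1]);
  }
  if (gaps.length === 0) {
    return { count: 0, meanMs: 0, stdDevMs: 0, pauses: 0 };
  }
  const meanMs = gaps.reduce((sum, gap) => sum + gap, 0) / gaps.length;
  // Population variance of the gaps.
  const variance = gaps.reduce((sum, gap) => sum + (gap - meanMs) ** 2, 0) / gaps.length;
  return {
    count: gaps.length,
    meanMs,
    stdDevMs: Math.sqrt(variance),
    // A "pause" here is any gap at or above the (arbitrary) threshold.
    pauses: gaps.filter((gap) => gap >= pauseThresholdMs).length,
  };
}

// Usage: perfectly regular input has zero spread, bursty input does not.
const regular = pauseStats([0, 100, 200, 300]);
const bursty = pauseStats([0, 100, 200, 3200]);
```

In this example, regular.stdDevMs is 0 while bursty contains one pause, which is the kind of difference a replay viewer or later statistical check could surface.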

Multi-Day Sessions

Forging a proof means performing the writing process in real time, across multiple actual sittings. The longer the document, the higher the cost. A forged 3,000-word essay would require hours of simulated typing spread over multiple days.

Paste Detection

Paste events are recorded separately and flagged in the replay. The post-processing pipeline distinguishes external pastes from internal rearrangements (cut+paste within the document) by replaying events against a parallel character-origin tracker.
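
The pipeline's actual logic is not shown here, so the following is only one plausible shape for a character-origin tracker. It keeps an array of origin labels parallel to the text and treats a paste as internal when its content equals the most recently deleted text, which is a simplifying assumption standing in for real cut detection. The name trackOrigins and the labels "typed", "pasted", "internal", and "external" are hypothetical.

```javascript
// Replay events while tracking where each character came from.
function trackOrigins(events) {
  let text = "";
  const origins = [];   // origins[i] labels text[i] as "typed" or "pasted"
  let lastCut = null;   // most recently deleted text and its origin labels
  const pastes = [];    // classification of each paste event

  for (const event of events) {
    const at = event.position;
    if (event.type === "delete") {
      lastCut = {
        text: text.slice(at, at + event.length),
        origins: origins.slice(at, at + event.length),
      };
      text = text.slice(0, at) + text.slice(at + event.length);
      origins.splice(at, event.length);
    } else if (event.type === "insert" || event.type === "paste") {
      let labels;
      if (event.type === "insert") {
        labels = Array(event.content.length).fill("typed");
      } else if (lastCut && lastCut.text.length > 0 && lastCut.text === event.content) {
        // Assumed heuristic: pasting back what was just removed is an internal move,
        // so the characters keep their original labels.
        labels = lastCut.origins.slice();
        pastes.push({ position: at, kind: "internal" });
      } else {
        labels = Array(event.content.length).fill("pasted");
        pastes.push({ position: at, kind: "external" });
      }
      text = text.slice(0, at) + event.content + text.slice(at);
      origins.splice(at, 0, ...labels);
    }
  }
  return { text, origins, pastes };
}
```

With labels kept per character, a report can state how much of the final text was typed versus brought in from outside, even after the author rearranges paragraphs.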

Tech Stack

  • Editor: CodeMirror 6 — extensible, excellent event access
  • Runtime: Node.js + Express
  • Frontend: Vanilla JS + Vite
  • Storage: File System Access API (Chrome/Edge) → local vault folder
  • Hashing: SHA-256 via Web Crypto API
  • Tests: Vitest + jsdom

What's Next

  • Statistical fingerprinting — WPM variance, pause distributions, quantitative authenticity signal
  • Verification badge — embeddable widget linking to the proof
  • Cryptographic anchoring — RFC 3161 timestamping to prove the proof wasn't fabricated retroactively
  • Privacy modes — statistical proof without full replay