docimprint

Evidence bundles — tamper-evident document capture

Bundle structure

Each bundle includes manifest.json listing artifacts with SHA-256 hashes, capture metadata (url, mode, captured_at), mode-specific result fields, and provenance (manifest_sha256, signature, merkle_root when indexed).

The agent envelope returns metadata, artifacts, provenance, and result in a single JSON response.

Verification

Deep verify re-fetches the manifest and artifacts from storage and recomputes their hashes. A complete bundle must carry a platform signature to return valid: true; an unsigned complete bundle fails verification. Quick verify (?quick=true) checks only the signature and manifest status.

Offline: download the ZIP (manifest.json + signature.json), recompute the manifest SHA-256, and verify the signature against the keys returned by GET /v1/keys (both active and retired).
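The hash-recomputation step of the offline check can be sketched in Python. This is a minimal sketch, not DocImprint's reference implementation. It assumes that manifest_sha256 is the SHA-256 of the raw bytes of manifest.json as stored in the ZIP (the text above does not say whether the manifest is canonicalized or whether provenance fields are excluded first) and that digests are written with a sha256: prefix. The function name manifest_digest_from_zip is hypothetical, and signature verification against GET /v1/keys is not shown.

```python
import hashlib
import io
import zipfile


def manifest_digest_from_zip(zip_source) -> str:
    """Return 'sha256:<hex>' for manifest.json inside a bundle ZIP.

    zip_source may be a filesystem path or a binary file-like object.
    Assumption: the digest covers the raw bytes of manifest.json.
    """
    with zipfile.ZipFile(zip_source) as bundle:
        raw = bundle.read("manifest.json")
    return "sha256:" + hashlib.sha256(raw).hexdigest()


if __name__ == "__main__":
    # Build a tiny in-memory ZIP so the sketch runs without a real bundle.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("manifest.json", b'{"bundle_id": "ev_demo"}')
        z.writestr("signature.json", b"{}")
    buf.seek(0)
    print(manifest_digest_from_zip(buf))
```

Compare the returned value with the manifest_sha256 reported for the bundle. A mismatch means either the manifest changed or the assumption about what is hashed does not hold.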

Citation verify: POST /v1/extract/:id/verify-citation with chunk_id and quoted text returns Merkle proof validity.

Lifecycle controls

Legal hold blocks both deletion and retention GC (409 LEGAL_HOLD). Deleting a notarized bundle requires acknowledge_notarized=true; otherwise the request fails with 409 BUNDLE_NOTARIZED. Version chains link bundles through parent_bundle_id; GET /history returns the chain.

Agent provenance logs and handoffs record what each agent did with a bundle for chain-of-custody audits. Handoffs are application-layer records (not cryptographically signed); hold toggles are audit-logged.

Bundle anatomy

The manifest.json at the root of every evidence bundle contains:

json
{
  "bundle_id": "ev_01j...",
  "manifest_sha256": "sha256:a3f...",
  "signature": "0x...",
  "signer": "0xDocImprintSigner...",
  "captured_at": "2026-06-10T14:32:00Z",
  "source": "https://example.com/contract.pdf",
  "mode": "extract",
  "artifacts": [
    { "name": "document.md",   "sha256": "sha256:b4e...", "size": 42180 },
    { "name": "screenshot.png","sha256": "sha256:c7a...", "size": 312400 },
    { "name": "ocr.txt",       "sha256": "sha256:d2b...", "size": 38900 }
  ],
  "merkle_root": "sha256:e9c...",
  "merkle_version": 2,
  "chunk_count": 47
}

Each artifact SHA-256 is computed over the raw file bytes. signature.json contains the EIP-191 signature metadata for manifest_sha256. Signer keys (active and retired) are available at GET /v1/keys.
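Because artifact hashes cover raw file bytes, they can be rechecked locally with nothing but a hash library. The following is a minimal sketch, assuming the bundle has been extracted to a local directory and that the manifest has the shape shown above (name, sha256 with a sha256: prefix, and size). The helper names sha256_label and check_artifacts are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_label(data: bytes) -> str:
    """Format a digest the way the manifest example does: 'sha256:<hex>'."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def check_artifacts(manifest: dict, bundle_dir: str) -> dict:
    """Map each artifact name to True if its size and SHA-256 match.

    A missing file is reported as False rather than raising.
    """
    results = {}
    for entry in manifest["artifacts"]:
        path = Path(bundle_dir) / entry["name"]
        if not path.is_file():
            results[entry["name"]] = False
            continue
        data = path.read_bytes()  # raw bytes, no decoding or newline changes
        size_ok = len(data) == entry["size"]
        results[entry["name"]] = size_ok and sha256_label(data) == entry["sha256"]
    return results
```

Load manifest.json with json.load and pass the resulting dict as manifest. Any False value means that artifact is missing or differs from what the manifest recorded.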

Merkle citation proofs

When DocImprint indexes a document, it splits the text into overlapping chunks, hashes each chunk, and builds a binary Merkle tree. The root is stored in manifest.json as merkle_root.

To verify that a passage came from a specific bundle without re-downloading all artifacts:

bash
# Request a citation proof for a specific chunk
curl -X POST https://api.docimprint.com/v1/extract/ev_abc123/verify-citation \
  -H "Content-Type: application/json" \
  -d '{
    "chunk_id": "chunk_007",
    "text": "The indemnification clause shall survive termination..."
  }'

# Response includes a Merkle proof path:
# {
#   "valid": true,
#   "chunk_hash": "sha256:f1a...",
#   "proof": ["sha256:a2b...", "sha256:b3c...", "sha256:c4d..."],
#   "merkle_root": "sha256:e9c...",
#   "verified_against": "manifest"
# }

To verify offline: hash the chunk text, then walk up the tree one level per entry in proof[]. At each level, concatenate the current hash with the sibling (sibling on the left or the right, depending on the parity of the current node's index) and hash the pair. If the final value matches merkle_root in manifest.json, the citation is valid.

On-chain notarization

Notarization anchors a bundle's existence to Base L2 at a specific block timestamp:

bash
curl -X POST https://api.docimprint.com/v1/extract/ev_abc123/notarize \
  -H "X-Payment: <x402-signature>" \
  -H "X-Wallet-Address: 0xYourWallet"
# Returns { "tx_hash": "0x...", "block_number": 12345678, "eas_attestation_uid": "0x..." }

Two mechanisms are used:

  • Calldata — The manifest SHA-256 is written as calldata in a Base transaction. Anyone can decode the calldata and verify it matches the manifest.
  • EAS attestation — An Ethereum Attestation Service attestation is created on Base with the bundle_id and manifest_sha256 (revocable: false). Queryable at easscan.org.

To verify a notarized bundle without trusting DocImprint: retrieve the EAS attestation using the eas_attestation_uid, confirm that the attested manifest_sha256 matches the manifest you downloaded, then verify the signature as usual. The block timestamp proves that the manifest existed no later than that block; it does not show how much earlier the bundle was created.
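The comparison step can be sketched as a small Python helper. This is a sketch under assumptions: it supposes the on-chain value is the same 32-byte digest rendered as 0x-prefixed hex, while the manifest uses a sha256: prefix, so both are normalized before comparing. The names normalize_digest and attestation_matches are hypothetical, and fetching the attestation from EAS is not shown.

```python
import hmac

_HEX = set("0123456789abcdef")


def normalize_digest(value: str) -> str:
    """Strip a 'sha256:' or '0x' prefix and return 64 lowercase hex chars."""
    v = value.strip().lower()
    for prefix in ("sha256:", "0x"):
        if v.startswith(prefix):
            v = v[len(prefix):]
    if len(v) != 64 or not set(v) <= _HEX:
        raise ValueError("expected a 32-byte hex digest")
    return v


def attestation_matches(attested: str, local_manifest_sha256: str) -> bool:
    """True if the attested digest equals the locally computed one."""
    return hmac.compare_digest(
        normalize_digest(attested), normalize_digest(local_manifest_sha256)
    )
```

Use the digest you computed yourself from the downloaded manifest as local_manifest_sha256, not the value the API reports, so the check does not depend on trusting the service.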

Legal hold workflow

Legal hold prevents deletion or retention garbage collection — useful for litigation, regulatory investigation, or discovery requests:

bash
# Place a bundle on legal hold
curl -X PUT https://api.docimprint.com/v1/extract/ev_abc123/hold \
  -H "Authorization: Bearer dr_live_..." \
  -H "Content-Type: application/json" \
  -d '{"reason": "SEC investigation RE: Form 10-K filing"}'

# Any DELETE or retention expiry will now return 409 LEGAL_HOLD
# The hold reason and timestamp are recorded in the bundle metadata

# Release hold (requires explicit acknowledgment)
curl -X DELETE https://api.docimprint.com/v1/extract/ev_abc123/hold \
  -H "Authorization: Bearer dr_live_..." \
  -H "Content-Type: application/json" \
  -d '{"acknowledge_legal_hold": true}'

For court submissions: download the bundle ZIP, compute the manifest SHA-256 locally, and submit the hash to your records system. If the bundle is notarized, its Base block timestamp and EAS attestation UID serve as independent time-anchor evidence.

Example

bash
curl https://api.docimprint.com/v1/extract/ev_abc123/verify
# 200 {"valid":true,...} or 409 tamper detected
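The same call can be made from Python. This is a minimal sketch using only the standard library. It assumes the response body is a JSON object with a valid field, that 409 signals tampering as in the comment above, and that ?quick=true is accepted on this endpoint. The function names and the three result labels are hypothetical.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

API_BASE = "https://api.docimprint.com/v1"


def verify_url(bundle_id: str, quick: bool = False) -> str:
    """Build the verify URL, optionally in quick mode."""
    url = f"{API_BASE}/extract/{urllib.parse.quote(bundle_id)}/verify"
    return url + "?quick=true" if quick else url


def classify(status: int, body) -> str:
    """Reduce a verify response to 'valid', 'tampered', or 'unverified'."""
    if not isinstance(body, dict):
        body = {}
    if status == 200 and body.get("valid") is True:
        return "valid"
    if status == 409:
        return "tampered"
    return "unverified"  # any other status, or 200 without valid: true


def verify_bundle(bundle_id: str, quick: bool = False) -> str:
    """Call the verify endpoint and classify the outcome."""
    try:
        with urllib.request.urlopen(verify_url(bundle_id, quick)) as resp:
            return classify(resp.status, json.load(resp))
    except urllib.error.HTTPError as err:
        raw = err.read()
        try:
            body = json.loads(raw) if raw else {}
        except ValueError:
            body = {}
        return classify(err.code, body)
```

Treating everything other than an explicit valid: true as "unverified" keeps the caller from mistaking an error page or a partial response for a successful check.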

What is an evidence bundle?

An evidence bundle is the tamper-evident output of POST /v1/extract: manifest.json with SHA-256 artifact hashes, EIP-191 signature, screenshot, PDF, Markdown, OCR, and optional Merkle citation proofs.

How do you verify an evidence bundle offline?

Download the ZIP via GET /v1/extract/{id}/download, recompute the SHA-256 of each artifact, and verify the EIP-191 manifest signature against GET /v1/keys. No API key is required for verify or download.

Can evidence bundles be anchored on-chain?

Yes. POST /v1/extract/{id}/notarize writes the manifest hash to Base L2. Legal hold via PUT /v1/extract/{id}/hold blocks deletion during litigation.
