
Aly Sawft · Founder & Engineer, Sawftware LLC · 8 min read

What is the Model Context Protocol and why document tools need it

What MCP is

The Model Context Protocol (MCP) is an open standard developed by Anthropic that defines how AI agents discover and call tools. Before MCP, every AI framework had its own tool-calling convention: LangChain tools, OpenAI function calling, Claude tool_use — each with different schemas, different discovery mechanisms, and different semantics.

MCP standardizes the interface. An MCP server exposes tools, each with a name, a machine-readable description, and an input schema that specifies its parameters.

An MCP client (any MCP-compatible agent: Claude, LangChain, Cursor, Zed, and others) can discover and call any MCP server without custom integration code. The agent learns what tools are available, chooses the right one for the task, and calls it with structured parameters.
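As a rough illustration of what discovery and invocation look like on the wire: MCP messages are JSON-RPC 2.0, with tools/list for discovery and tools/call for invocation. The sketch below builds both requests in Python. The extract_text description text and its source parameter are assumptions for illustration, not DocImprint's published schema.

```python
import json

# Illustrative tool descriptor, shaped like one entry in a tools/list result.
# The description wording and the "source" parameter are assumptions.
EXTRACT_TEXT_TOOL = {
    "name": "extract_text",
    "description": "Extract plain text from a document at a URL.",
    "inputSchema": {
        "type": "object",
        "properties": {"source": {"type": "string"}},
        "required": ["source"],
    },
}


def jsonrpc_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request object as a plain dict."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# Discovery: ask the server which tools it exposes.
list_request = jsonrpc_request(1, "tools/list")

# Invocation: call one tool by name with structured arguments.
call_request = jsonrpc_request(
    2,
    "tools/call",
    {
        "name": EXTRACT_TEXT_TOOL["name"],
        "arguments": {"source": "https://example.com/annual-report.pdf"},
    },
)

print(json.dumps(list_request))
print(json.dumps(call_request, indent=2))
```

In practice an MCP client library builds these messages for you; the point is only that every server is addressed with the same two methods.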

The analogy is REST for APIs: just as REST standardized how servers expose resources over HTTP, MCP standardizes how tools are exposed to AI agents. You build the tool once; every MCP-compatible agent can use it.

Why document tools specifically need MCP

Document intelligence (extracting, analyzing, verifying, and searching documents) is one of the most common tasks AI agents are asked to perform. But document APIs have historically been complex.

Without MCP, wiring a document API into an agent requires custom integration code per framework: a LangChain tool definition, a Claude tool_use schema, an OpenAI function spec. Each needs to be maintained separately and updated when the API changes.

With MCP, you define the tools once on the server. Every MCP-compatible agent uses them without additional integration work. The agent can discover the full document intelligence capability surface — all 14 DocImprint tools — from a single endpoint.
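To make the per-framework maintenance cost concrete, here is a sketch of the adapter code a single tool needs without MCP. The output field names follow the publicly documented Claude tool_use and OpenAI function-calling formats; the summarize_document description and schema are placeholders, not DocImprint's actual definition.

```python
from typing import Any, Dict

# Placeholder descriptor in MCP's shape (name, description, inputSchema).
MCP_TOOL: Dict[str, Any] = {
    "name": "summarize_document",
    "description": "Summarize a document at a URL.",  # illustrative wording
    "inputSchema": {
        "type": "object",
        "properties": {"source": {"type": "string"}},
        "required": ["source"],
    },
}


def to_claude_tool(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Claude tool_use expects the schema under 'input_schema'."""
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["inputSchema"],
    }


def to_openai_function(tool: Dict[str, Any]) -> Dict[str, Any]:
    """OpenAI function calling nests the schema under 'parameters'."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["inputSchema"],
        },
    }


claude_spec = to_claude_tool(MCP_TOOL)
openai_spec = to_openai_function(MCP_TOOL)
```

Each adapter is small, but every one has to be kept in step with the API for every tool. With MCP the descriptor is served once and the client handles the rest.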

DocImprint's 14 MCP tools

The DocImprint MCP server exposes document intelligence as a complete tool suite. Each tool has a machine-readable description, input schema, and structured output:

Extract tools:

Analysis tools:

Collection tools:

Configure DocImprint MCP in Claude Desktop (claude_desktop_config.json):

```json
{
  "mcpServers": {
    "docimprint": {
      "command": "npx",
      "args": ["mcp-remote", "https://api.docimprint.com/mcp"],
      "env": {
        "AUTHORIZATION": "Bearer dr_live_..."
      }
    }
  }
}
```

MCP vs raw HTTP for agent integrations

DocImprint exposes the same capabilities over MCP and REST. When should an agent use MCP vs direct HTTP?

Use MCP when:

Use raw HTTP when:

Both paths hit the same Cloudflare Workers backend. MCP tools are thin wrappers over the REST endpoints — no capability gap between protocols. An agent configured with MCP can call extract_text; a backend service can call POST /v1/extract with identical results.
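One way to picture the "thin wrapper" relationship is a lookup from tool name to REST path. The sketch below builds (but does not send) the HTTP request behind a tool call. The extract_text to /v1/extract pairing comes from the paragraph above; the summarize_document to /v1/summarize pairing and the source argument are inferred from the curl example later in this article and should be treated as assumptions.

```python
import json
import urllib.request

BASE_URL = "https://api.docimprint.com"

# Tool name -> REST path. Only the first pairing is stated outright in the
# article; the second is inferred and may not match the real API.
TOOL_TO_PATH = {
    "extract_text": "/v1/extract",
    "summarize_document": "/v1/summarize",
}


def build_rest_request(tool_name, arguments, api_key):
    """Return the POST request that corresponds to an MCP tool call."""
    path = TOOL_TO_PATH[tool_name]  # KeyError for tools not in the table
    body = json.dumps(arguments).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The request is constructed only; nothing is sent over the network here.
request = build_rest_request(
    "extract_text",
    {"source": "https://example.com/annual-report.pdf"},
    "dr_live_...",
)
```

Passing the request to urllib.request.urlopen would perform the actual call with a real key.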

For production agent marketplaces, MCP is increasingly the default discovery layer. For enterprise integrations with existing API gateways, REST with API keys remains common. DocImprint supports both without forcing a choice.

How tool discovery works

When an MCP client connects to the DocImprint MCP server, it sends a tools/list request (exposed as list_tools() in the Python SDK). The server returns descriptions and schemas for all 14 tools. The client keeps these available to the agent for the rest of the session.
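A minimal sketch of the client side of discovery, assuming the result has the standard shape of a "tools" array whose entries carry name, description, and inputSchema. The two sample tools and their parameters are illustrative stand-ins, not DocImprint's real schemas.

```python
from typing import Any, Dict, List


def build_registry(list_result: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Index the tools from a discovery result by name for the session."""
    return {tool["name"]: tool for tool in list_result.get("tools", [])}


def missing_arguments(tool: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """List required parameters (per the input schema) absent from a call."""
    required = tool.get("inputSchema", {}).get("required", [])
    return [name for name in required if name not in arguments]


# Illustrative discovery result; parameter names are assumptions.
SAMPLE_RESULT = {
    "tools": [
        {
            "name": "extract_text",
            "description": "Extract plain text from a document.",
            "inputSchema": {
                "type": "object",
                "properties": {"source": {"type": "string"}},
                "required": ["source"],
            },
        },
        {
            "name": "ask_collection",
            "description": "Answer a question across a collection.",
            "inputSchema": {
                "type": "object",
                "properties": {
                    "collection_id": {"type": "string"},
                    "question": {"type": "string"},
                },
                "required": ["collection_id", "question"],
            },
        },
    ]
}

registry = build_registry(SAMPLE_RESULT)
gaps = missing_arguments(registry["ask_collection"], {"question": "Is there a DPA?"})
```

Checking required fields before sending a call lets the client catch a malformed request locally instead of waiting for the server to reject it.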

When the user (or orchestrating agent) asks a question, the agent selects the appropriate tool. The selection is based on the tool description, which is why writing good descriptions matters. DocImprint's tool descriptions specify not just what the tool does, but when to use it.

This guidance helps the agent choose correctly without being explicitly told which tool to use for each request.

A typical agent workflow over MCP

Here is a complete workflow showing how an agent uses DocImprint MCP tools to answer a compliance question:

  1. User: "Do all of our supplier contracts from Q1 2026 include a data processing addendum?"

  2. Agent calls create_collection with the Q1 supplier contracts (or reuses an existing collection ID).

  3. Agent calls ask_collection with the question "Does this contract include a data processing addendum or DPA clause?"

  4. The MCP server runs vector search across the indexed bundles, retrieves relevant chunks, generates an answer, and returns: a verdict per bundle (yes/no/unclear), the cited passages with chunk_ids, and the bundle_ids where the relevant clauses were found.

  5. Agent calls check_claims on any ambiguous bundles: "This contract includes a data processing addendum" — to get a structured supported/contradicted/not_found verdict with evidence.

  6. Agent calls verify_bundle on the bundles before presenting the final answer — confirming integrity before citing them.

  7. Agent returns a complete answer: which contracts include a DPA, which do not, with citations traceable to specific paragraphs in each contract.

The entire workflow runs through MCP tool calls. The agent does not need to know anything about DocImprint's HTTP API. It uses tools the way a human uses a word processor — pick the right one for the task, use it, move to the next step.
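The steps above can be sketched as a short orchestration function. The tool names come from the workflow; every argument name and result field (sources, collection_id, bundles, verdict, results, status, valid) is hypothetical, chosen only to make the control flow concrete. A stub stands in for the real MCP client so the sketch runs offline.

```python
from typing import Any, Callable, Dict, List

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

QUESTION = "Does this contract include a data processing addendum or DPA clause?"
CLAIM = "This contract includes a data processing addendum"


def dpa_compliance_check(call_tool: ToolCaller, sources: List[str]) -> Dict[str, str]:
    """Return a final verdict per bundle_id (field names are hypothetical)."""
    collection = call_tool("create_collection", {"sources": sources})
    answer = call_tool(
        "ask_collection",
        {"collection_id": collection["collection_id"], "question": QUESTION},
    )
    verdicts: Dict[str, str] = {}
    for item in answer["bundles"]:
        bundle_id, verdict = item["bundle_id"], item["verdict"]
        if verdict == "unclear":
            # Step 5: resolve ambiguous bundles with a structured claim check.
            check = call_tool("check_claims", {"bundle_id": bundle_id, "claims": [CLAIM]})
            status = check["results"][0]["status"]
            verdict = {"supported": "yes", "contradicted": "no"}.get(status, "unclear")
        # Step 6: confirm integrity before citing the bundle.
        if not call_tool("verify_bundle", {"bundle_id": bundle_id})["valid"]:
            verdict = "unverified"
        verdicts[bundle_id] = verdict
    return verdicts


def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Offline stub returning canned results in place of a real MCP client."""
    canned = {
        "create_collection": {"collection_id": "col_demo"},
        "ask_collection": {
            "bundles": [
                {"bundle_id": "b1", "verdict": "yes"},
                {"bundle_id": "b2", "verdict": "unclear"},
            ]
        },
        "check_claims": {"results": [{"status": "contradicted"}]},
        "verify_bundle": {"valid": True},
    }
    return canned[name]


final = dpa_compliance_check(fake_call_tool, ["https://example.com/contract-a.pdf"])
print(final)
```

Injecting call_tool keeps the orchestration independent of any particular MCP client library, and makes the flow easy to exercise against stubs.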

A2A and OpenAPI: the broader discovery stack

MCP is one layer in DocImprint's agent discovery stack. The full stack:

MCP (https://api.docimprint.com/mcp): tool-oriented interface for document intelligence operations. Best for agents that use tools interactively.

A2A agent card (https://api.docimprint.com/.well-known/agent.json): the Agent2Agent (A2A) protocol introduced by Google. Declares DocImprint as an agent with capabilities, endpoints, and authentication requirements. Used by agent orchestrators that route tasks between agents.

OpenAPI 3.1 (https://api.docimprint.com/openapi.json): machine-readable HTTP spec for the full REST API. Used by code generation tools, SDK generators, and agents that prefer raw HTTP over MCP.

SKILL.md (https://api.docimprint.com/SKILL.md): a plain-text capability declaration that LLMs can read directly. Describes what DocImprint does, when to use it, and how to call it — without any protocol overhead.

Together these four endpoints mean any agent, regardless of which discovery protocol it uses, can find and use DocImprint without manual configuration.
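As a small sketch of how an agent might choose among these, the snippet below maps each protocol to its endpoint and picks the first one the agent supports. The URLs are the ones listed above; the short protocol keys and the preference order are assumptions, not something DocImprint prescribes.

```python
from typing import Iterable, Tuple

DISCOVERY_ENDPOINTS = {
    "mcp": "https://api.docimprint.com/mcp",
    "a2a": "https://api.docimprint.com/.well-known/agent.json",
    "openapi": "https://api.docimprint.com/openapi.json",
    "skill": "https://api.docimprint.com/SKILL.md",
}

# Assumed ordering: most structured interface first.
DEFAULT_PREFERENCE = ("mcp", "a2a", "openapi", "skill")


def pick_endpoint(
    supported: Iterable[str],
    preference: Tuple[str, ...] = DEFAULT_PREFERENCE,
) -> Tuple[str, str]:
    """Return (protocol, url) for the first preferred protocol supported."""
    supported_set = set(supported)
    for protocol in preference:
        if protocol in supported_set:
            return protocol, DISCOVERY_ENDPOINTS[protocol]
    raise LookupError("agent supports none of the known discovery protocols")


choice = pick_endpoint({"openapi", "skill"})
```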

Setting up DocImprint MCP: two minutes

Connecting DocImprint MCP to Claude Desktop or any MCP-compatible agent requires one configuration block and an API key.

Get an API key from docimprint.com/get-started. Then add the server configuration to your MCP client. For Claude Desktop, edit the config file at ~/Library/Application Support/Claude/claude_desktop_config.json on macOS.

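The mcp-remote package (from npm) proxies remote MCP servers over stdio, which is what most desktop MCP clients expect. In the configuration above, the AUTHORIZATION environment variable carries your API key. If the server rejects requests as unauthenticated, note that mcp-remote also accepts --header arguments (for example, --header "Authorization: Bearer dr_live_...") for setting request headers explicitly.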
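If you already have other MCP servers configured, hand-editing the file risks overwriting them. The following is a minimal sketch of merging the DocImprint entry into an existing configuration dict; the server block mirrors the configuration shown earlier, while the "other" server and the helper name are made up for the example.

```python
import copy
import json

# Same server block as in the configuration shown earlier in this article.
DOCIMPRINT_SERVER = {
    "command": "npx",
    "args": ["mcp-remote", "https://api.docimprint.com/mcp"],
    "env": {"AUTHORIZATION": "Bearer dr_live_..."},
}


def add_mcp_server(config, name, server):
    """Return a copy of config with the server added under mcpServers."""
    updated = copy.deepcopy(config)
    updated.setdefault("mcpServers", {})[name] = copy.deepcopy(server)
    return updated


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
merged = add_mcp_server(existing, "docimprint", DOCIMPRINT_SERVER)

print(json.dumps(merged, indent=2))
```

To apply it for real, load the JSON from the config path with json.load, call add_mcp_server, and write the result back before restarting the client.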

After restarting Claude Desktop, you should see DocImprint's tools in the tool list. Ask Claude "what DocImprint tools do you have?" to confirm discovery. Then try: "summarize this PDF" with a URL — Claude will call extract_text or summarize_document automatically.

Verify MCP connection and list available tools (bash):

```bash
# Test the MCP server directly (requires mcp-remote or a local MCP client),
# or use the REST API equivalent to verify your key works:
curl https://api.docimprint.com/v1/summarize \
  -H "Authorization: Bearer dr_live_..." \
  -H "Content-Type: application/json" \
  -d '{"source":"https://example.com/annual-report.pdf"}'

# Once MCP is configured in Claude Desktop:
# "Summarize the key findings in this annual report: https://example.com/annual-report.pdf"
# Claude will call summarize_document automatically
```

What comes next for MCP and documents

MCP is young — the spec was released in late 2024 and the ecosystem is growing quickly. Several developments are worth watching for document intelligence specifically:

Streaming tool responses: current MCP tools return a complete response synchronously. For large document extractions, streaming (returning partial results as they are available) would significantly improve perceived latency. This is on the MCP roadmap.

Tool chaining in the protocol: today, agents chain tools by calling them sequentially. Future MCP versions may support declarative tool chains, letting the server express "after extract_text, you probably want check_claims" — reducing agent reasoning overhead.

Persistent sessions and context: an agent currently cannot "remember" a collection ID across MCP sessions without storing it externally. Protocol-level session state would make multi-turn document workflows more natural.
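Until something like that exists, the workaround is to store the ID outside the session. Here is a minimal sketch using a local JSON file; the file name, the label scheme, and the col_demo ID are all invented for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def save_collection_id(store: Path, label: str, collection_id: str) -> None:
    """Record a collection ID under a human-readable label."""
    data = json.loads(store.read_text()) if store.exists() else {}
    data[label] = collection_id
    store.write_text(json.dumps(data, indent=2))


def load_collection_id(store: Path, label: str) -> Optional[str]:
    """Return the stored ID for a label, or None if it was never saved."""
    if not store.exists():
        return None
    return json.loads(store.read_text()).get(label)


# Demo in a temporary directory so nothing is left behind on disk.
with tempfile.TemporaryDirectory() as tmp:
    store_path = Path(tmp) / "collections.json"
    save_collection_id(store_path, "q1-2026-suppliers", "col_demo")
    restored = load_collection_id(store_path, "q1-2026-suppliers")
    missing = load_collection_id(store_path, "unknown-label")
```

A later session can then look up the label and pass the restored ID to ask_collection instead of re-indexing the documents.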

Cross-agent tool sharing: if Agent A creates a collection, Agent B should be able to use it without re-authenticating. This is an open problem in multi-agent MCP deployments — wallet-based identity (x402) is part of the answer.

The combination of MCP discovery + x402 payment + evidence bundles represents a complete protocol stack for trustworthy, economically self-sustaining AI agent document workflows. That is the direction the agent economy is moving.
