API reference

OpenZIM MCP exposes three kinds of MCP surfaces:

| Surface | Count (advanced mode) | Default mode |
|---|---|---|
| Tools (callable functions) | 8 | only zim_query exposed in Simple mode |
| Prompts (slash-command workflows) | 3 | always available |
| Resources (URI-addressable data) | 3 templates + subscriptions | always available |

In Simple mode (the default) only the zim_query natural-language tool is exposed. Pass --mode advanced (or set OPENZIM_MCP_TOOL_MODE=advanced) to expose all 8 specialized tools below.

v2.0.0 collapsed the prior 22-tool advanced surface into 8 consolidated tools. The full mechanical v1 → v2 mapping is reproduced in the migration table at the bottom of this page; it also lives in CHANGELOG.md.

Source of truth: openzim_mcp/tools/. If signatures here disagree with code, file an issue — code is canonical.

Output format

Tools return one of:

  • Structured response objects (most tools): typed payloads (SearchResponse, EntryBundle, MetadataResponse, etc.) that the MCP transport serializes. Clients should parse the JSON envelope.
  • Markdown-string responses: zim_get (single-entry mode) returns a rendered string with Title:, Path:, Type: lines, then a ## Content block.
  • Tool errors: every tool catches exceptions and returns a structured ToolErrorPayload ({operation, message, hint?}) rather than raising. Path entries inside errors are redacted to ...filename.zim form so the canonical allowed-directory layout is never leaked.
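
A client has to tell these three shapes apart before it can use a response. The sketch below is one way to do that on the client side; it is not part of OpenZIM MCP. The helper names (is_tool_error, parse_entry_markdown) are hypothetical, and the exact line layout of the single-entry string is assumed from the description above.

```python
from typing import Any, Dict


def is_tool_error(response: Any) -> bool:
    """True when a response looks like a ToolErrorPayload.

    Assumption: the payload arrives as a dict carrying at least
    the 'operation' and 'message' keys.
    """
    return isinstance(response, dict) and response.keys() >= {"operation", "message"}


def parse_entry_markdown(text: str) -> Dict[str, str]:
    """Split a single-entry string into its header fields and the body.

    Assumption: 'Title:', 'Path:' and 'Type:' each sit on their own line
    before a '## Content' marker.
    """
    header, _, body = text.partition("## Content")
    fields: Dict[str, str] = {}
    for line in header.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in {"Title", "Path", "Type"}:
            fields[key.strip().lower()] = value.strip()
    fields["content"] = body.strip()
    return fields
```

Checking is_tool_error first keeps error payloads out of the parsing path, since tools return errors instead of raising.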

Simple mode

zim_query

Single natural-language tool exposed by default. Routes to the underlying advanced operations via an intent parser.

Signature:

zim_query(
    query: str,
    zim_file_path: Optional[str] = None,
    limit: Optional[int] = None,
    offset: int = 0,
    content_offset: int = 0,
    cursor: Optional[str] = None,
    max_content_length: Optional[int] = None,
    compact: bool = True,
    compact_budget: Optional[Union[str, int]] = None,
    synthesize: bool = False,
) -> Union[str, SynthesizeResponse, ToolErrorPayload]

| Parameter | Type | Default | Notes |
|---|---|---|---|
| query | string | (required) | Natural-language question or instruction |
| zim_file_path | string | None | Auto-selects when only one ZIM is in the allowed dirs |
| limit | int | None | Max results for search/browse intents |
| offset | int | 0 | Pagination offset |
| content_offset | int | 0 | Pagination within long article body |
| cursor | string | None | Opaque cursor for resuming paginated results |
| max_content_length | int | None | Max characters for retrieved articles |
| compact | bool | True | Compact prose rendering (token-budget aware) |
| compact_budget | str or int | None | Override the compact-mode budget |
| synthesize | bool | False | When True, return a SynthesizeResponse (multi-source briefing) |

Intents recognized (incomplete list):

  • “list available ZIM files”
  • “search for biology in wikipedia.zim”
  • “get article Evolution”
  • “show structure of Biology”
  • “browse namespace C with limit 10”
  • “search all files for python”
  • “walk namespace M”
  • “find article titled Photosynthesis”
  • “articles related to Climate_Change”
  • “summary of Quantum_mechanics”

Returns: intent-specific output; for searches a markdown list of results, for retrievals a rendered article, for “synthesize” intents a structured SynthesizeResponse.


Advanced tools (8)

zim_search

Full-text / title / suggest search dispatch. Collapses five v1 search tools (search_zim_file, search_all, search_with_filters, find_entry_by_title, get_search_suggestions) into one.

zim_search(
    query: str,
    mode: Literal["fulltext", "title", "suggest"] = "fulltext",
    zim_file_path: Optional[str] = None,
    cross_file: bool = False,
    namespace: Optional[str] = None,
    content_type: Optional[str] = None,
    limit: Optional[int] = None,
    offset: int = 0,
    cursor: Optional[str] = None,
) -> Any

| Parameter | Notes |
|---|---|
| query | Required search term, title, or partial-query prefix (depending on mode) |
| mode | "fulltext" (default; libzim full-text index), "title" (title-indexed lookup with fast C/<Title> path), or "suggest" (auto-complete prefix) |
| zim_file_path | Optional when cross_file=True; otherwise required (unless only one ZIM is in the allowed dirs) |
| cross_file | When True, queries every allowed ZIM file. Results from multiple archives are merged into per_file_results |
| namespace, content_type | Optional filters (only meaningful for mode="fulltext") |
| limit | 1–100 for fulltext/title, 1–50 for suggest |
| cursor | Opaque cursor from a prior result; resumes the search where it left off |

Returns: mode-shaped response — SearchResponse / SearchAllResponse / SearchWithFiltersResponse / FindEntryResponse / SearchSuggestionsResponse — or ToolErrorPayload on validation failure. Subsequent pages include next_cursor until exhausted.


zim_get

Single-entry / batch / binary / main-page / view-mode entry fetch. Collapses seven v1 retrieval tools into one.

zim_get(
    zim_file_path: str,
    entry_path: Optional[str] = None,
    entry_paths: Optional[List[str]] = None,
    view: Literal["full", "summary", "toc", "structure"] = "full",
    binary: bool = False,
    main_page: bool = False,
    max_content_length: Optional[int] = None,
    content_offset: int = 0,
    compact: bool = False,
    compact_budget: Optional[Union[str, int]] = None,
) -> Any

Exactly one of four branches must be selected (the view modes are a variant of the single-entry branch):

| Branch | Selector | Returns |
|---|---|---|
| Single-entry (article body) | entry_path="..." | Markdown string with Title: / Path: / ## Content |
| Single-entry view modes | entry_path="..." + view="summary" / "toc" / "structure" | Structured response (summary / TOC tree / headings) |
| Batch | entry_paths=[...] (up to 50) | {results, succeeded, failed} with per-entry success/error |
| Binary | entry_path="..." + binary=True | {path, title, mime_type, size, encoding, data} (base64) |
| Main page | main_page=True | Archive main page entry |

The four branches are mutually exclusive. Setting more than one (e.g. entry_path + entry_paths, or main_page + view="summary") returns a ToolErrorPayload with operation="invalid_path_combination".
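
A client can check the combination before sending the call. The following is a minimal sketch of such a pre-check, covering only the combinations named above. The function select_branch and its return labels are hypothetical, and the server's own validation in openzim_mcp/tools/ may differ.

```python
from typing import List, Optional


def select_branch(
    entry_path: Optional[str] = None,
    entry_paths: Optional[List[str]] = None,
    binary: bool = False,
    main_page: bool = False,
    view: str = "full",
) -> str:
    """Return a label for the selected branch, or raise ValueError."""
    if entry_path is not None and entry_paths is not None:
        raise ValueError("invalid_path_combination: entry_path + entry_paths")
    if main_page:
        # Assumption: main_page excludes every other selector, including a
        # non-default view (the text gives main_page + view="summary" as invalid).
        if entry_path is not None or entry_paths is not None or binary or view != "full":
            raise ValueError("invalid_path_combination: main_page + another selector")
        return "main_page"
    if entry_paths is not None:
        # binary/view combined with a batch is not described in the text;
        # this sketch leaves that case to the server.
        if len(entry_paths) > 50:
            raise ValueError("batch fetch accepts at most 50 entry paths")
        return "batch"
    if entry_path is None:
        raise ValueError("invalid_path_combination: no selector set")
    if binary:
        return "binary"
    return "single_entry" if view == "full" else f"single_entry_view:{view}"
```

For example, select_branch(entry_path="C/Evolution", view="toc") returns "single_entry_view:toc", while passing both entry_path and entry_paths raises before any request is made.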

| Parameter | Notes |
|---|---|
| view | "full" (default; article body), "summary" (opening paragraph), "toc" (hierarchical TOC), "structure" (headings + section anchors). Ignored when binary=True or main_page=True |
| binary | When True, returns raw bytes (base64) with native MIME type. Default per-entry cap 10 MiB |
| max_content_length | Per-entry character cap (minimum 100) |
| content_offset | Page through long articles without re-fetching the prefix |
| compact | Compact-mode prose (default False at v2.0, which preserves legacy byte-identical behavior). v2.5 will revisit the default with telemetry |
| compact_budget | Optional budget override for compact mode |

Smart retrieval: if direct path access fails, single-entry mode falls back to search-derived candidate terms; resolved paths are cached. When fallback resolves to a different path, the response shows both Requested Path: and Actual Path:.


zim_get_section

Section-level fetch by section ID. Renamed from the v1 get_section tool; the new compact=True default is the only behavioral change.

zim_get_section(
    zim_file_path: str,
    entry_path: str,
    section_id: str,
    max_chars: Optional[int] = None,
    compact: bool = True,
    compact_budget: Optional[Union[str, int]] = None,
) -> Any

| Parameter | Notes |
|---|---|
| section_id | Required; the heading ID or anchor from a prior zim_get(view="toc") or zim_get(view="structure") response |
| max_chars | Per-section character cap |
| compact | Default True. v2.0 surface-uniformity parameter: a no-op at the data layer because the bundle is always compact-rendered. v2.5 #18 will wire the real raw-text path |
| compact_budget | Optional budget override |

Returns: structured response with section body, heading metadata, and adjacent-section hints.


zim_browse

Namespace browse / walk dispatch. Collapses the v1 browse_namespace + walk_namespace tools.

zim_browse(
    zim_file_path: str,
    namespace: str,
    mode: Literal["page", "walk"] = "page",
    cursor: Optional[str] = None,
    limit: Optional[int] = None,
    offset: int = 0,
) -> Any

| Mode | Behavior |
|---|---|
| "page" (default) | Sampled namespace overview, paginated by limit + offset. May cap entries for very large namespaces; use mode="walk" for exhaustive iteration |
| "walk" | Cursor-paginated deterministic iteration by entry ID. Pass next_cursor to a follow-up call until done: true |

| Parameter | Range | Notes |
|---|---|---|
| namespace |  | C, M, W, X, A, I for legacy schemes; domain-style names for modern archives |
| limit | 1–500 | Default 50 (page) / 200 (walk) |
| cursor, offset |  | cursor only valid with mode="walk"; offset only valid with mode="page" |

Returns: {entries, next_cursor, done} for walk, or a sampled JSON list for page.
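
The walk contract (follow next_cursor until done is true) can be sketched as a generator. This is an illustration under assumptions: call_browse stands in for whatever function your MCP client uses to invoke zim_browse and return the decoded response as a dict, and walk_entries and max_pages are hypothetical names.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def walk_entries(
    call_browse: Callable[..., Dict[str, Any]],
    zim_file_path: str,
    namespace: str,
    limit: int = 200,
    max_pages: int = 10_000,
) -> Iterator[Any]:
    """Yield every entry of a namespace by following next_cursor until done."""
    cursor: Optional[str] = None
    for _ in range(max_pages):  # guard against a loop that never reports done
        page = call_browse(
            zim_file_path=zim_file_path,
            namespace=namespace,
            mode="walk",
            cursor=cursor,
            limit=limit,
        )
        yield from page.get("entries", [])
        cursor = page.get("next_cursor")
        if page.get("done") or cursor is None:
            return
```

Because the generator is lazy, a caller can stop early (for instance after the first 1,000 entries) without fetching the remaining pages.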


zim_metadata

Combined archive metadata + namespaces. Collapses the v1 get_zim_metadata + list_namespaces tools — the response now includes both the M-namespace metadata and the deterministic namespace breakdown.

zim_metadata(zim_file_path: str) -> Any

Returns: structured response with:

  • metadata: archive M-namespace fields (title, language, creator, flavour, date, etc.).
  • namespaces: a deterministic namespace breakdown (surfaces minority namespaces such as M, W, X, I that random sampling could miss).
  • archive_identity: {uuid, is_multipart}, the libzim archive identity (added in v2.1).
  • index_capabilities: {has_fulltext_index, has_title_index}, indicating whether full-text search and title suggestions will work against this archive (added in v2.1).
  • counter_breakdown: {mimetype: count} parsed from the M/Counter metadata, so you can profile an archive’s content composition without walking it. Omitted when the archive has no M/Counter entry (added in v2.1).
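
As an illustration of profiling an archive from counter_breakdown, the sketch below ranks MIME types by their share of counted entries. The helper name top_mimetypes is hypothetical, and the input is assumed to be the {mimetype: count} mapping described above, already decoded into a dict.

```python
from typing import Dict, List, Tuple


def top_mimetypes(counter_breakdown: Dict[str, int], n: int = 3) -> List[Tuple[str, float]]:
    """Return the n most common MIME types with their share of all counted entries."""
    total = sum(counter_breakdown.values())
    if total == 0:
        return []
    ranked = sorted(counter_breakdown.items(), key=lambda kv: kv[1], reverse=True)
    return [(mime, count / total) for mime, count in ranked[:n]]
```

Since the field is omitted for archives without an M/Counter entry, read it with something like response.get("counter_breakdown", {}) before passing it in.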


zim_links

Outbound / related link-graph dispatch. Collapses the v1 extract_article_links + get_related_articles tools.

zim_links(
    zim_file_path: str,
    entry_path: str,
    direction: Literal["outbound", "related"] = "outbound",
    cursor: Optional[str] = None,
    limit: Optional[int] = None,
    offset: int = 0,
) -> Any

| Direction | Behavior |
|---|---|
| "outbound" (default) | All internal + external links extracted from the article body. Drops non-navigable schemes (javascript:, mailto:, tel:, data:, blob:, vbscript:) |
| "related" | Outbound link-graph neighbors with deduplication |

direction="inbound" is reserved for v2.5 (lands with the link-graph sidecar).

Relative hrefs are resolved against the source entry’s directory; redirects are followed to resolved paths; the content namespace is identified correctly on domain-scheme archives; self-referential refs are rejected.

Returns: {outbound_links: [...], internal_count, external_count, media_count} (outbound) or {related: [...]} (related).
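
To make the href rules concrete, here is a rough sketch of scheme filtering, relative resolution, and self-reference rejection using only the standard library. It is not the server's implementation: resolve_href is a hypothetical name, and redirect following, percent-decoding, and namespace detection are left out.

```python
import posixpath
from typing import Optional
from urllib.parse import urlsplit

# Schemes the text lists as non-navigable.
NON_NAVIGABLE = {"javascript", "mailto", "tel", "data", "blob", "vbscript"}


def resolve_href(source_entry: str, href: str) -> Optional[str]:
    """Resolve an href found in source_entry; None for dropped or self links."""
    parts = urlsplit(href)
    if parts.scheme in NON_NAVIGABLE:
        return None
    if parts.scheme or parts.netloc:
        return href  # external link, kept unchanged
    if not parts.path:
        return None  # fragment-only href points back at the source entry
    base_dir = posixpath.dirname(source_entry)
    resolved = posixpath.normpath(posixpath.join(base_dir, parts.path))
    return None if resolved == source_entry else resolved
```

One caveat of this sketch: an href such as Category:Foo parses as having a scheme, so it is treated as external unless it is written as ./Category:Foo.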


zim_health

Two calls in one. With no argument, returns combined server health, configuration, and loaded archives (collapses the v1 get_server_health + get_server_configuration + list_zim_files tools). With a zim_file_path, validates and diagnoses that one archive instead (added in v2.1).

zim_health(zim_file_path: Optional[str] = None) -> Any

| Argument | Behavior |
|---|---|
| (omitted) | Combined server-state report: {health, configuration, loaded_archives}. |
| zim_file_path | Per-archive integrity/identity check via libzim: runs Archive.check() and reports checksum, index capabilities, and identity. Lets a caller tell a valid archive from a corrupt one. |

Server-state response (no argument) — shape (abbreviated):

{
  "health": {
    "timestamp": "2026-05-27T15:30:00.000000",
    "status": "healthy",
    "server_name": "openzim-mcp",
    "uptime_info": { "process_id": "[REDACTED]", "started_at": "..." },
    "cache_performance": { "hits": 1024, "misses": 256, "hit_rate": 0.8 },
    "health_checks": { "directories_accessible": 1, "zim_files_found": 5, "permissions_ok": true },
    "recommendations": [],
    "warnings": []
  },
  "configuration": {
    "server_name": "openzim-mcp",
    "allowed_directories": ["...zim-files"],
    "cache_enabled": true,
    "cache_max_size": 100,
    "tool_mode": "advanced",
    "transport": "stdio",
    "config_hash": "<sha256>",
    "server_pid": "[REDACTED]"
  },
  "loaded_archives": [
    { "name": "wikipedia_en_100_2026-02.zim", "path": "...wikipedia_en_100_2026-02.zim", "size": 124857600, "modified": "2026-02-15T10:30:00" }
  ]
}

process_id / server_pid are always "[REDACTED]" — diagnostic output frequently lands in bug reports. Allowed directories are shown as ...basename to avoid leaking the canonical layout.

Archive-validation response (with zim_file_path, added in v2.1):

{
  "is_valid": true,
  "has_checksum": true,
  "checksum": "<hex>",
  "has_fulltext_index": true,
  "has_title_index": true,
  "uuid": "<archive uuid>",
  "is_multipart": false,
  "path": "...archive.zim",
  "name": "archive.zim"
}

is_valid is the result of libzim’s Archive.check() structural-integrity probe — a quick way to tell a valid archive from a corrupt or truncated one. A non-indexed archive reports has_fulltext_index: false; full-text zim_search against it then degrades gracefully to a no_xapian_index reason instead of erroring.
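
One possible use of the validation response is deciding which zim_search modes are worth trying. The sketch below assumes the response has been decoded into a dict with the fields shown above. The helper name usable_search_modes is hypothetical, and tying both "title" and "suggest" to has_title_index is an assumption.

```python
from typing import Any, Dict, List


def usable_search_modes(report: Dict[str, Any]) -> List[str]:
    """List the zim_search modes worth trying for a validated archive."""
    if not report.get("is_valid", False):
        return []  # corrupt or truncated archive: skip searching it
    modes: List[str] = []
    if report.get("has_fulltext_index"):
        modes.append("fulltext")
    if report.get("has_title_index"):
        # Assumption: title lookup and suggestions both rely on the title index.
        modes.extend(["title", "suggest"])
    return modes
```

A client could call this once per archive and cache the result, rather than discovering a missing index through a no_xapian_index reason on each query.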


MCP prompts

Three slash-command workflows. See openzim_mcp/tools/prompts.py.

User-supplied arguments are sanitized: ASCII control characters are replaced with spaces, backticks are stripped (template delimiter), and the value is capped at 200 characters before being interpolated. Apostrophes and double quotes are preserved (real entry paths contain them, e.g. C/Schrödinger's_cat).
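
The three sanitization steps can be sketched as follows. This is an approximation for illustration, not the code in openzim_mcp/tools/prompts.py: the function name, the order of the steps, and the exact control-character range (0x00–0x1F plus 0x7F) are assumptions.

```python
import re

# Assumption: "ASCII control characters" means 0x00-0x1F and 0x7F (DEL).
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def sanitize_prompt_arg(value: str, max_len: int = 200) -> str:
    """Replace control chars with spaces, strip backticks, cap the length."""
    value = _CONTROL_CHARS.sub(" ", value)
    value = value.replace("`", "")
    return value[:max_len]
```

Apostrophes and double quotes pass through untouched, so a path like C/Schrödinger's_cat is returned as given.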

/research

research(topic: str)

Workflow: zim_search(query=topic, cross_file=True) across archives, then zim_get(entry_path=..., view="summary") on the top hits, then ask the user which thread to pursue.

/summarize

summarize(zim_file_path: str, entry_path: str)

Workflow: zim_get(view="toc") → zim_get(view="summary") → zim_links(direction="outbound"), combined into a TL;DR + section list + 5–10 most relevant outbound links.

/explore

explore(zim_file_path: str)

Workflow: zim_metadata → zim_get(main_page=True) → zim_browse(namespace="C", mode="walk", limit=5). Produces a compact briefing.

If a prompt is invoked without required args (or args reduce to empty after sanitization), the response asks the user to supply them.


MCP resources

Three URI templates. See openzim_mcp/tools/resource_tools.py.

zim://files

JSON list of every ZIM file in the allowed directories. Same shape as the loaded_archives field of zim_health.

zim://{name}

Overview of one ZIM file: metadata, namespace breakdown, and main-page preview (truncated to 2000 characters). {name} is the bare basename without .zim (e.g. wikipedia_en_climate_change_mini_2024-06).

zim://{name}/entry/{path}

Single entry served with native MIME type:

  • HTML / text entries → text/html, text/plain, application/json, etc., body as text.
  • Binary entries (images, PDFs) → appropriate MIME, body as raw bytes (FastMCP base64-wraps).

Encoding requirement: clients MUST URL-encode / as %2F in the {path} segment because FastMCP’s URI template engine treats / as a segment separator. Example:

zim://wikipedia_en/entry/A%2FClimate_change

A literal slash will fail to route. See the Resources, prompts & subscriptions guide for full details.
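
A small client-side helper can apply the encoding rule consistently. The sketch below uses urllib.parse.quote with safe="" so that / becomes %2F; the function name entry_resource_uri is hypothetical, and {name} is passed through unchanged on the assumption that it is already a bare basename.

```python
from urllib.parse import quote


def entry_resource_uri(name: str, entry_path: str) -> str:
    """Build a zim://{name}/entry/{path} URI with '/' in the path sent as %2F."""
    # safe="" makes quote() percent-encode '/', which it leaves alone by default.
    return f"zim://{name}/entry/{quote(entry_path, safe='')}"
```

Calling entry_resource_uri("wikipedia_en", "A/Climate_change") reproduces the example URI above. Note that quote also percent-encodes other reserved characters, such as apostrophes and spaces.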


Resource subscriptions

Clients can subscribe to zim://files or zim://{name} and receive notifications/resources/updated whenever:

  • A .zim file is added to or removed from an allowed directory (zim://files)
  • A specific .zim file’s mtime changes (zim://{name})

Configuration:

| Env var | Default | Notes |
|---|---|---|
| OPENZIM_MCP_SUBSCRIPTIONS_ENABLED | true | master switch |
| OPENZIM_MCP_WATCH_INTERVAL_SECONDS | 5 | 1–60 |

See Resources, prompts & subscriptions for full client-side examples.


Rate limiting

All tools are subject to a global token-bucket limiter (default 10 req/s, burst 20). Costs are charged per internal operation, not per tool call — a v2 tool that dispatches over multiple modes resolves to a specific underlying operation key:

| Tool call | Internal operation | Cost |
|---|---|---|
| zim_search(mode="fulltext") or zim_search(mode="title") | search / find_entry_by_title | 2 |
| zim_search(mode="suggest") | suggestions | 1 |
| zim_search(cross_file=True) | charged per archive scanned | varies |
| zim_get(entry_path=...) | get_entry | 1 |
| zim_get(entry_paths=[...]) | get_zim_entries (per-entry charge) | N |
| zim_get(binary=True) | get_binary_entry | 3 |
| zim_get(view="structure") / view="toc" / view="summary" | get_structure | 1 |
| zim_browse(mode="page") or zim_browse(mode="walk") | browse_namespace | 1 |
| zim_metadata | get_metadata | 1 |
| zim_links(direction="related") | get_related_articles | 2 |
| zim_links(direction="outbound") | default | 1 |
| zim_health, zim_get_section, zim_query (per-intent) | default | 1 |

Tune via OPENZIM_MCP_RATE_LIMIT__REQUESTS_PER_SECOND, OPENZIM_MCP_RATE_LIMIT__BURST_SIZE, and OPENZIM_MCP_RATE_LIMIT__PER_OPERATION_LIMITS. See Configuration.

When the limit is exceeded, the tool returns a ToolErrorPayload with a retry_after hint; it does not raise.
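
For readers unfamiliar with token buckets, the sketch below shows the general technique with per-operation costs, using the documented defaults (10 tokens per second, burst 20). It is a generic illustration and not the limiter shipped with OpenZIM MCP: the class name, method name, and injectable clock are assumptions.

```python
import time
from typing import Callable


class TokenBucket:
    """Token bucket that refills at `rate` tokens per second, up to `burst`."""

    def __init__(
        self,
        rate: float = 10.0,
        burst: float = 20.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = burst  # start with a full bucket
        self.updated = clock()

    def try_acquire(self, cost: float = 1.0) -> float:
        """Take `cost` tokens and return 0.0, or return the seconds to wait."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return 0.0
        return (cost - self.tokens) / self.rate
```

With these defaults a full bucket covers ten cost-2 searches back to back; the eleventh gets a non-zero wait, which plays the role of the retry_after hint.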


Error responses

Every tool wraps exceptions and returns a structured ToolErrorPayload:

{
  "status": "error",
  "operation": "invalid_path_combination",
  "message": "Set exactly one of: entry_path, entry_paths, binary, main_page",
  "hint": "Use entry_paths=[...] for batch fetch"
}

The error classes in openzim_mcp/exceptions.py are the canonical source for the underlying exception hierarchy:

  • OpenZimMcpError — base
  • OpenZimMcpConfigurationError
  • OpenZimMcpValidationError
  • OpenZimMcpArchiveError
  • OpenZimMcpRateLimitError

Absolute filesystem paths in error messages are redacted to ...filename.zim form. PIDs are redacted in diagnostics output. Error text in v2.0 is safe to copy into bug reports.


v1 → v2 migration

The full mechanical mapping. Every v1 tool name in this table is intentional — it is the canonical place to look up “what does my old call become?”. For the narrative context see CHANGELOG.md → migration table.

| v1 call | v2 equivalent | Notes |
|---|---|---|
| list_zim_files() | zim_health().loaded_archives | health/config/files consolidated |
| get_server_health() | zim_health().health | health/config/files consolidated |
| get_server_configuration() | zim_health().configuration | health/config/files consolidated |
| get_zim_metadata(path) | zim_metadata(path).metadata | now includes namespace breakdown too |
| list_namespaces(path) | zim_metadata(path).namespaces | now includes namespace breakdown too |
| get_main_page(path) | zim_get(path, main_page=True) | one of four mutually-exclusive branches |
| search_zim_file(path, q) | zim_search(q, zim_file_path=path) | default mode="fulltext" |
| search_all(q) | zim_search(q, cross_file=True) | multi-archive merge in per_file_results |
| search_with_filters(path, q, ns=, ct=) | zim_search(q, zim_file_path=path, namespace=ns, content_type=ct) | filters only meaningful for fulltext |
| find_entry_by_title(path, title) | zim_search(title, zim_file_path=path, mode="title") | fast title-indexed C/<Title> path |
| get_search_suggestions(path, prefix) | zim_search(prefix, zim_file_path=path, mode="suggest") | autocomplete-style prefix |
| get_zim_entry(path, entry_path) | zim_get(path, entry_path=entry_path) | smart-retrieval fallback on miss |
| get_zim_entries(path, entries) | zim_get(path, entry_paths=entries) | up to 50 per call; per-entry cost |
| get_binary_entry(path, entry_path) | zim_get(path, entry_path=entry_path, binary=True) | base64 wire payload, 10 MiB default cap |
| get_entry_summary(path, entry_path) | zim_get(path, entry_path=entry_path, view="summary") | one of four view modes |
| get_table_of_contents(path, entry_path) | zim_get(path, entry_path=entry_path, view="toc") | one of four view modes |
| get_article_structure(path, entry_path) | zim_get(path, entry_path=entry_path, view="structure") | one of four view modes |
| get_section(path, entry_path, section_id) | zim_get_section(path, entry_path, section_id) | now defaults compact=True |
| browse_namespace(path, namespace) | zim_browse(path, namespace) | default mode="page" |
| walk_namespace(path, namespace) | zim_browse(path, namespace, mode="walk") | cursor-paginated deterministic iteration |
| extract_article_links(path, entry_path) | zim_links(path, entry_path) | default direction="outbound" |
| get_related_articles(path, entry_path) | zim_links(path, entry_path, direction="related") | replaces standalone tool |

There are no on-the-wire aliases at v2.0 — old tool names disappear cleanly per the foundational v2 decisions.
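
Because the server ships no aliases, a client that still issues v1 names could translate them locally. The sketch below encodes the tool names and fixed selector arguments from the table above as a lookup. V1_TO_V2 and translate are hypothetical names, and argument renames (for example entries to entry_paths) and picking a sub-field such as .health out of the response are left to the caller.

```python
from typing import Any, Dict, Tuple

# v1 tool name -> (v2 tool name, arguments the v2 call must add)
V1_TO_V2: Dict[str, Tuple[str, Dict[str, Any]]] = {
    "list_zim_files": ("zim_health", {}),
    "get_server_health": ("zim_health", {}),
    "get_server_configuration": ("zim_health", {}),
    "get_zim_metadata": ("zim_metadata", {}),
    "list_namespaces": ("zim_metadata", {}),
    "get_main_page": ("zim_get", {"main_page": True}),
    "search_zim_file": ("zim_search", {}),
    "search_all": ("zim_search", {"cross_file": True}),
    "search_with_filters": ("zim_search", {}),
    "find_entry_by_title": ("zim_search", {"mode": "title"}),
    "get_search_suggestions": ("zim_search", {"mode": "suggest"}),
    "get_zim_entry": ("zim_get", {}),
    "get_zim_entries": ("zim_get", {}),
    "get_binary_entry": ("zim_get", {"binary": True}),
    "get_entry_summary": ("zim_get", {"view": "summary"}),
    "get_table_of_contents": ("zim_get", {"view": "toc"}),
    "get_article_structure": ("zim_get", {"view": "structure"}),
    "get_section": ("zim_get_section", {}),
    "browse_namespace": ("zim_browse", {}),
    "walk_namespace": ("zim_browse", {"mode": "walk"}),
    "extract_article_links": ("zim_links", {}),
    "get_related_articles": ("zim_links", {"direction": "related"}),
}


def translate(v1_name: str, **kwargs: Any) -> Tuple[str, Dict[str, Any]]:
    """Return the v2 tool name and arguments for a v1 call (KeyError if unknown)."""
    tool, fixed = V1_TO_V2[v1_name]
    return tool, {**kwargs, **fixed}
```

For instance, translate("walk_namespace", zim_file_path=path, namespace="C") yields the zim_browse tool name with mode="walk" added to the arguments.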


Need configuration help? See Configuration. Deploying over HTTP? See HTTP and Docker Deployment. Using resources / subscriptions? See Resources, prompts & subscriptions.

v1.x is in maintenance through 2026-11-27. See CHANGELOG for the v1 → v2 migration table.
