Configuration

Every supported environment variable, CLI flag, and configuration field for OpenZIM MCP.

Source of truth: openzim_mcp/config.py and openzim_mcp/defaults.py. If this page disagrees with code, code wins — please file an issue.

How configuration is loaded

OpenZIM MCP's configuration is a pydantic-settings BaseSettings model. A field can be set in up to three ways, listed from highest to lowest priority:

  1. CLI flag (where one is wired through __main__.py — currently --mode, --transport, --host, --port)
  2. Environment variable, prefixed with OPENZIM_MCP_
  3. Default value baked into defaults.py

Nested fields use a double-underscore separator: OPENZIM_MCP_CACHE__MAX_SIZE, OPENZIM_MCP_RATE_LIMIT__BURST_SIZE, etc.

A bad value raises OpenZimMcpConfigurationError at startup with a human-readable message — pydantic’s raw ValidationError dump is wrapped before it reaches the operator.
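
As a rough illustration of the naming and priority rules above, the sketch below maps an environment variable name to a nested field path and resolves one value across the three sources. It is not the project's code (pydantic-settings does this internally); env_to_field_path and resolve are hypothetical helper names used only for this example.

```python
from typing import List, Mapping, Optional

PREFIX = "OPENZIM_MCP_"      # env var prefix described above
NESTED_DELIMITER = "__"      # double underscore separates nested fields


def env_to_field_path(name: str) -> Optional[List[str]]:
    """Map an env var name to a lowercase field path; None if unprefixed."""
    if not name.startswith(PREFIX):
        return None
    remainder = name[len(PREFIX):]
    return [part.lower() for part in remainder.split(NESTED_DELIMITER)]


def resolve(
    field: str,
    cli: Mapping[str, str],
    env: Mapping[str, str],
    defaults: Mapping[str, str],
) -> str:
    """Return the CLI value if present, else the env value, else the default."""
    if field in cli:
        return cli[field]
    env_name = PREFIX + field.upper().replace(".", NESTED_DELIMITER)
    if env_name in env:
        return env[env_name]
    return defaults[field]
```

For example, env_to_field_path("OPENZIM_MCP_CACHE__MAX_SIZE") yields ["cache", "max_size"], and a --transport flag passed to resolve wins over OPENZIM_MCP_TRANSPORT.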

Tool mode

# Default: simple — exposes one natural-language tool (zim_query)
# Set to advanced for the full 8-tool advanced surface
export OPENZIM_MCP_TOOL_MODE=advanced

Or via CLI:

openzim-mcp --mode advanced /path/to/zim/files

| Mode | Tool surface | When to use |
| --- | --- | --- |
| simple (default) | 1 tool: zim_query (NL intent router) | small/local LLMs, MCP hosts that struggle with large tool catalogs |
| advanced | 8 tools: zim_query, zim_search, zim_get, zim_get_section, zim_browse, zim_metadata, zim_links, zim_health (see API reference) | hosts that handle a richer tool surface, scripting, fine-grained control |

v2.0.0 ships the 8-tool advanced surface — collapsed from the prior 22-tool v1 surface (Phase F). Advanced-mode wire footprint dropped from ~36KB to ~23.5KB, clearing the MCP Tax pain band (25–50KB schema) for small-model dispatch. See the v1 → v2 migration table if you’re upgrading.

Transport

# Default: stdio (no network)
# Other values: http (streamable HTTP), sse (legacy, localhost-only)
export OPENZIM_MCP_TRANSPORT=http
export OPENZIM_MCP_HOST=127.0.0.1            # default 127.0.0.1
export OPENZIM_MCP_PORT=8000                 # default 8000
export OPENZIM_MCP_AUTH_TOKEN="$(openssl rand -hex 32)"  # required for non-localhost
export OPENZIM_MCP_CORS_ORIGINS='["https://app.example"]'  # JSON list; "*" rejected

| Field | Env var | Default | Notes |
| --- | --- | --- | --- |
| auth_token | OPENZIM_MCP_AUTH_TOKEN | unset | Bearer token for streamable HTTP. Stored as SecretStr; never logged. Set via env only; never put it in a file. |
| cors_origins | OPENZIM_MCP_CORS_ORIGINS | [] | JSON list of allowed origins. Wildcard "*" is rejected at startup (whitespace-padded " * " too). |
| host | OPENZIM_MCP_HOST | 127.0.0.1 | Bind address. Non-loopback hosts require auth_token for http; sse always rejects non-loopback. |
| port | OPENZIM_MCP_PORT | 8000 | 1-65535. |
| transport | OPENZIM_MCP_TRANSPORT | stdio | One of stdio/http/sse. sse has no auth middleware and is loopback-only. |
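
The wildcard rule for cors_origins can be pictured with a small sketch. This is an illustration under assumptions, not the validator in config.py: parse_cors_origins is a hypothetical name, and ValueError stands in for OpenZimMcpConfigurationError.

```python
import json
from typing import List


def parse_cors_origins(raw: str) -> List[str]:
    """Parse a JSON list of origins and refuse any wildcard entry."""
    origins = json.loads(raw)
    if not isinstance(origins, list) or not all(isinstance(o, str) for o in origins):
        raise ValueError("cors_origins must be a JSON list of strings")
    for origin in origins:
        # strip() catches whitespace-padded wildcards such as " * "
        if origin.strip() == "*":
            raise ValueError("wildcard CORS origin '*' is not allowed")
    return origins
```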

Safe-default startup check — the server refuses to bind if either:

  • transport=http + non-loopback host + no auth_token → OpenZimMcpConfigurationError: HTTP transport bound to {host} requires authentication.
  • transport=sse + non-loopback host → OpenZimMcpConfigurationError: SSE transport bound to {host} is not allowed.

For full HTTP deployment guidance see HTTP and Docker Deployment.
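
The two refusal conditions of the safe-default startup check can be sketched as a single function. This is a minimal illustration, assuming that "localhost" and loopback IP literals count as loopback and that any other hostname is treated as non-loopback; ConfigurationError and check_bind are stand-in names, not the project's API.

```python
import ipaddress
from typing import Optional


class ConfigurationError(Exception):
    """Stand-in for OpenZimMcpConfigurationError."""


def is_loopback(host: str) -> bool:
    """True for 'localhost' and loopback IP literals (assumed definition)."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal: treated as non-loopback in this sketch.
        return False


def check_bind(transport: str, host: str, auth_token: Optional[str]) -> None:
    """Raise if the transport/host/token combination is unsafe to bind."""
    if transport == "stdio" or is_loopback(host):
        return
    if transport == "sse":
        raise ConfigurationError(f"SSE transport bound to {host} is not allowed.")
    if transport == "http" and not auth_token:
        raise ConfigurationError(
            f"HTTP transport bound to {host} requires authentication."
        )
```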

Resource subscriptions

Polling-based mtime watcher that emits notifications/resources/updated to subscribed clients when allowed directories change or .zim files are replaced.

export OPENZIM_MCP_SUBSCRIPTIONS_ENABLED=true        # default true
export OPENZIM_MCP_WATCH_INTERVAL_SECONDS=5          # default 5, range 1-60

Setting OPENZIM_MCP_SUBSCRIPTIONS_ENABLED=false skips the polling task entirely; subscribe calls succeed but never fire updates. Raise OPENZIM_MCP_WATCH_INTERVAL_SECONDS for low-priority watching, or lower it for fresher notifications. The minimum is 1 second, so sub-second polling is not available.
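
The idea behind an mtime-polling watcher can be sketched in a few lines: take a snapshot of modification times, wait one interval, take another, and report the differences. The function names below are hypothetical and the real watcher may track more than this (for example, the directories themselves).

```python
import os
from typing import Dict, Set


def snapshot(directory: str) -> Dict[str, float]:
    """Record the mtime of every .zim file directly inside `directory`."""
    result: Dict[str, float] = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(".zim"):
                result[entry.path] = entry.stat().st_mtime
    return result


def changed_paths(before: Dict[str, float], after: Dict[str, float]) -> Set[str]:
    """Paths added, removed, or whose mtime differs between two snapshots."""
    changed = set(before) ^ set(after)  # added or removed
    changed.update(
        path for path in before.keys() & after.keys() if before[path] != after[path]
    )
    return changed
```

A polling loop would call snapshot once per watch interval and send one notification per entry in changed_paths.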

Cache

Single LRU+TTL cache shared by all tools and the smart-retrieval path-mapping store.

export OPENZIM_MCP_CACHE__ENABLED=true         # default true
export OPENZIM_MCP_CACHE__MAX_SIZE=100         # default 100, range 1-10000
export OPENZIM_MCP_CACHE__TTL_SECONDS=3600     # default 3600, range 60-86400
export OPENZIM_MCP_CACHE__PERSISTENCE_ENABLED=false              # default false
export OPENZIM_MCP_CACHE__PERSISTENCE_PATH="$HOME/.cache/openzim-mcp"  # default
export OPENZIM_MCP_CACHE__LIBZIM_CLUSTER_CACHE_MAX_SIZE_BYTES=16777216  # default unset (libzim 16 MiB)
export OPENZIM_MCP_CACHE__LIBZIM_DIRENT_CACHE_MAX_COUNT=512             # default unset (libzim 512)

| Field | Default | Range / notes |
| --- | --- | --- |
| cache.enabled | true | bool |
| cache.max_size | 100 | 1-10000 |
| cache.persistence_enabled | false | bool |
| cache.persistence_path | ~/.cache/openzim-mcp | normalized to absolute path; falls in a predictable location even when CWD is unpredictable (containers, systemd) |
| cache.ttl_seconds | 3600 | 60-86400 (1 min - 24 h) |
| cache.libzim_cluster_cache_max_size_bytes | unset (libzim default 16 MiB) | 0 – 4 GiB, bytes; process-global (libzim's cluster cache) |
| cache.libzim_dirent_cache_max_count | unset (libzim default 512) | 0 – 10,000,000, count of dirents; per-archive |

These last two are independent of the response cache above: they size libzim’s own reader caches. Leave them unset to keep libzim’s defaults. The cluster cache is sized in bytes and is process-global; the dirent cache is a count of directory entries applied per opened archive. See Performance optimization for tuning guidance.

Cache stats surface inside zim_health under .health.cache_performance — there are no explicit warm_cache/cache_stats/cache_clear tools (restart the server to flush).

Persistence note: when persistence_enabled=true, cache.set() validates that the value is JSON-serializable at write time and raises OpenZimMcpValidationError if not (no silent str() coercion). Internal callers always pass JSON-safe values (strings, dicts, lists, numbers, bools), so this only matters if you’ve patched in a custom caller that stashes a Path, datetime, or other non-JSON object. Pure in-memory caches (persistence off) still accept arbitrary Python objects.
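
To make the LRU+TTL behaviour and the persistence check concrete, here is a minimal sketch of such a cache. It is an illustration only: LruTtlCache is a hypothetical class, ValueError stands in for OpenZimMcpValidationError, and the real cache's internals may differ.

```python
import json
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class LruTtlCache:
    """Tiny LRU cache whose entries also expire after ttl_seconds."""

    def __init__(
        self,
        max_size: int = 100,
        ttl_seconds: float = 3600,
        persistence_enabled: bool = False,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._persist = persistence_enabled
        self._clock = clock
        self._entries: "OrderedDict[str, tuple]" = OrderedDict()

    def set(self, key: str, value: Any) -> None:
        if self._persist:
            # Fail at write time rather than coercing with str().
            try:
                json.dumps(value)
            except (TypeError, ValueError) as exc:
                raise ValueError(f"value for {key!r} is not JSON-serializable") from exc
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict least recently used

    def get(self, key: str) -> Optional[Any]:
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._entries[key]  # expired
            return None
        self._entries.move_to_end(key)  # mark as recently used
        return value
```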

Content

export OPENZIM_MCP_CONTENT__MAX_CONTENT_LENGTH=100000   # default 100000, min 100
export OPENZIM_MCP_CONTENT__SNIPPET_LENGTH=1000         # default 1000, min 100
export OPENZIM_MCP_CONTENT__DEFAULT_SEARCH_LIMIT=10     # default 10, range 1-100

zim_get(entry_path=...) requires max_content_length >= 100 (lower values are rejected with a ToolErrorPayload).

Logging

export OPENZIM_MCP_LOGGING__LEVEL=INFO              # default INFO; one of DEBUG/INFO/WARNING/ERROR/CRITICAL
export OPENZIM_MCP_LOGGING__FORMAT="%(asctime)s - %(name)s - %(levelname)s - %(message)s"  # default

Only level and format are configurable — there is no separate JSON-mode toggle. To emit JSON, supply a JSON format string (or wrap stdout with a JSON-converting handler in your deployment).
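
One way to read "supply a JSON format string" is a format whose literal text is a JSON object with the usual logging placeholders inside it. The sketch below checks that such a string produces parseable JSON for a simple message. JSON_FORMAT is an example value, not a project default, and messages containing double quotes or newlines would break the output because the standard Formatter does not escape them.

```python
import json
import logging

# Example format string; the key names are arbitrary choices for this sketch.
JSON_FORMAT = (
    '{"time": "%(asctime)s", "logger": "%(name)s", '
    '"level": "%(levelname)s", "message": "%(message)s"}'
)

formatter = logging.Formatter(JSON_FORMAT)
record = logging.LogRecord(
    name="openzim_mcp",
    level=logging.INFO,
    pathname="example.py",
    lineno=0,
    msg="server started",
    args=None,
    exc_info=None,
)
line = formatter.format(record)
parsed = json.loads(line)  # succeeds because the message has no quotes
```

For messages that may contain quotes, a JSON-converting handler in the deployment is the safer route.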

Rate limiting

Token-bucket limiter; global + per-operation acquire is atomic (one pass, no transient over-consumption).

export OPENZIM_MCP_RATE_LIMIT__ENABLED=true                # default true
export OPENZIM_MCP_RATE_LIMIT__REQUESTS_PER_SECOND=10.0    # default 10.0
export OPENZIM_MCP_RATE_LIMIT__BURST_SIZE=20               # default 20, max 1000
# Per-operation limits (nested dict, JSON):
export OPENZIM_MCP_RATE_LIMIT__PER_OPERATION_LIMITS='{"search": {"requests_per_second": 4, "burst_size": 8}}'

Per-internal-operation cost defaults are defined in defaults.RATE_LIMIT_COSTS. The v2 tools dispatch internally; each branch resolves to a specific operation key the limiter charges against.

Note on key vocabulary. The internal operation keys below are stable pydantic-validated identifiers — they happen to share their string form with the v1 tool names, but they are config keys, not v1 tool surfaces. The keys remained stable through the Phase F tool collapse so that existing PER_OPERATION_LIMITS overrides in production deployments did not need to be rewritten when the wire-level tool surface changed.

| v2 tool call | Internal operation key (literal) | Cost |
| --- | --- | --- |
| zim_search(mode="fulltext") | "search" | 2 |
| zim_search(mode="title") | "find_entry_by_title" | 2 |
| zim_search with namespace/content_type filters | "search_with_filters" | 2 |
| zim_search(mode="suggest") | "suggestions" | 1 |
| zim_get(entry_path=...) | "get_entry" | 1 |
| zim_get(entry_paths=[...]) | "get_zim_entries" (×N per-entry) | 1 |
| zim_get(binary=True) | "get_binary_entry" | 3 |
| zim_get(view="structure") / zim_get(view="toc") / zim_get(view="summary") | "get_structure" | 1 |
| zim_browse(...) | "browse_namespace" | 1 |
| zim_metadata(...) | "get_metadata" | 1 |
| zim_links(direction="related") | "get_related_articles" | 2 |
| zim_health, zim_get_section, zim_query per-intent | "default" | 1 |

zim_get(entry_paths=[...]) charges per-entry cost so a batch of 50 doesn’t trivially bypass per-second limits.
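
A cost-weighted token bucket with a combined global and per-operation check might look like the sketch below. It is an assumption-laden illustration, not the project's limiter: TokenBucket and acquire are hypothetical names, and a single module-level lock is used to keep the check-then-deduct step atomic.

```python
import threading
import time
from typing import Callable

_LOCK = threading.Lock()  # one lock so check and deduct happen together


class TokenBucket:
    """Refills at `requests_per_second`, holds at most `burst_size` tokens."""

    def __init__(
        self,
        requests_per_second: float,
        burst_size: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = requests_per_second
        self.capacity = float(burst_size)
        self.tokens = float(burst_size)
        self._clock = clock
        self._last = clock()

    def can_take(self, cost: float) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self._last = now
        return self.tokens >= cost

    def take(self, cost: float) -> None:
        self.tokens -= cost


def acquire(global_bucket: TokenBucket, op_bucket: TokenBucket, cost: float) -> bool:
    """Deduct from both buckets only if both can afford the cost."""
    with _LOCK:
        if global_bucket.can_take(cost) and op_bucket.can_take(cost):
            global_bucket.take(cost)
            op_bucket.take(cost)
            return True
        return False  # a refusal consumes nothing from either bucket
```

Under this model a batch of 50 entries at cost 1 each calls acquire 50 times, so it drains the buckets exactly as 50 separate requests would.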

Per-operation overrides use the literal key strings shown in the Internal operation key column. To throttle binary fetches:

export OPENZIM_MCP_RATE_LIMIT__PER_OPERATION_LIMITS='{"get_binary_entry": {"requests_per_second": 1, "burst_size": 2}}'

Server identity

export OPENZIM_MCP_SERVER_NAME=openzim-mcp     # default openzim-mcp; reported in serverInfo

serverInfo.version always reports openzim-mcp's installed version (read from importlib.metadata) rather than the FastMCP SDK default.

Allowed directories

ZIM file directories. Pass on the CLI:

openzim-mcp /srv/zim /home/user/zim-files

At least one directory is required; each is canonicalized (resolves symlinks and ..) and verified to exist and be a directory at startup. Paths in MCP error responses are redacted to a ...filename.zim form so the canonical layout is never leaked.
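
The startup canonicalization and the error-path redaction can be sketched with pathlib. Both helper names are hypothetical, ValueError stands in for the server's own error type, and the exact redacted form is assumed from the "...filename.zim" description above.

```python
from pathlib import Path
from typing import List


def canonicalize_allowed_dirs(raw_dirs: List[str]) -> List[Path]:
    """Resolve symlinks and '..', then require each path to be a directory."""
    if not raw_dirs:
        raise ValueError("at least one allowed directory is required")
    resolved: List[Path] = []
    for raw in raw_dirs:
        path = Path(raw).expanduser().resolve()
        if not path.is_dir():
            raise ValueError(f"not an existing directory: {raw}")
        resolved.append(path)
    return resolved


def redact(path: str) -> str:
    """Keep only the file name so the directory layout is not exposed."""
    return "..." + Path(path).name
```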

Complete reference table

Sorted alphabetically by field name within each grouping.

| Field | Env var | Default | Notes |
| --- | --- | --- | --- |
| auth_token | OPENZIM_MCP_AUTH_TOKEN | unset | SecretStr, never logged, env-only |
| cache.enabled | OPENZIM_MCP_CACHE__ENABLED | true | bool |
| cache.libzim_cluster_cache_max_size_bytes | OPENZIM_MCP_CACHE__LIBZIM_CLUSTER_CACHE_MAX_SIZE_BYTES | unset (libzim 16 MiB) | 0 – 4 GiB, bytes, process-global |
| cache.libzim_dirent_cache_max_count | OPENZIM_MCP_CACHE__LIBZIM_DIRENT_CACHE_MAX_COUNT | unset (libzim 512) | 0 – 10,000,000, count, per-archive |
| cache.max_size | OPENZIM_MCP_CACHE__MAX_SIZE | 100 | 1-10000 |
| cache.persistence_enabled | OPENZIM_MCP_CACHE__PERSISTENCE_ENABLED | false | bool |
| cache.persistence_path | OPENZIM_MCP_CACHE__PERSISTENCE_PATH | ~/.cache/openzim-mcp | normalized absolute |
| cache.ttl_seconds | OPENZIM_MCP_CACHE__TTL_SECONDS | 3600 | 60-86400 |
| content.default_search_limit | OPENZIM_MCP_CONTENT__DEFAULT_SEARCH_LIMIT | 10 | 1-100 |
| content.max_content_length | OPENZIM_MCP_CONTENT__MAX_CONTENT_LENGTH | 100000 | min 100 |
| content.snippet_length | OPENZIM_MCP_CONTENT__SNIPPET_LENGTH | 1000 | min 100 |
| cors_origins | OPENZIM_MCP_CORS_ORIGINS | [] | JSON list; * rejected |
| host | OPENZIM_MCP_HOST | 127.0.0.1 | non-loopback requires auth (http) or refuses (sse) |
| logging.format | OPENZIM_MCP_LOGGING__FORMAT | %(asctime)s - %(name)s - %(levelname)s - %(message)s | format string |
| logging.level | OPENZIM_MCP_LOGGING__LEVEL | INFO | DEBUG/INFO/WARNING/ERROR/CRITICAL |
| port | OPENZIM_MCP_PORT | 8000 | 1-65535 |
| rate_limit.burst_size | OPENZIM_MCP_RATE_LIMIT__BURST_SIZE | 20 | 1-1000 |
| rate_limit.enabled | OPENZIM_MCP_RATE_LIMIT__ENABLED | true | bool |
| rate_limit.per_operation_limits | OPENZIM_MCP_RATE_LIMIT__PER_OPERATION_LIMITS | {} | nested JSON dict |
| rate_limit.requests_per_second | OPENZIM_MCP_RATE_LIMIT__REQUESTS_PER_SECOND | 10.0 | positive float |
| server_name | OPENZIM_MCP_SERVER_NAME | openzim-mcp | reported in serverInfo |
| subscriptions_enabled | OPENZIM_MCP_SUBSCRIPTIONS_ENABLED | true | watcher master switch |
| tool_mode | OPENZIM_MCP_TOOL_MODE | simple | simple or advanced |
| transport | OPENZIM_MCP_TRANSPORT | stdio | stdio/http/sse |
| watch_interval_seconds | OPENZIM_MCP_WATCH_INTERVAL_SECONDS | 5 | 1-60 |

Profiles

Local development (stdio)

export OPENZIM_MCP_LOGGING__LEVEL=DEBUG
export OPENZIM_MCP_CACHE__MAX_SIZE=50
export OPENZIM_MCP_CACHE__TTL_SECONDS=1800
openzim-mcp ~/zim-files

Production stdio (e.g. behind a desktop MCP host)

export OPENZIM_MCP_LOGGING__LEVEL=INFO
export OPENZIM_MCP_CACHE__MAX_SIZE=500
export OPENZIM_MCP_CACHE__TTL_SECONDS=14400
export OPENZIM_MCP_CACHE__PERSISTENCE_ENABLED=true
export OPENZIM_MCP_CONTENT__MAX_CONTENT_LENGTH=200000
openzim-mcp /srv/zim

Production HTTP service

export OPENZIM_MCP_TRANSPORT=http
export OPENZIM_MCP_HOST=127.0.0.1
export OPENZIM_MCP_PORT=8000
export OPENZIM_MCP_AUTH_TOKEN="$(openssl rand -hex 32)"
export OPENZIM_MCP_CORS_ORIGINS='["https://app.example.com"]'
export OPENZIM_MCP_LOGGING__LEVEL=INFO
export OPENZIM_MCP_CACHE__MAX_SIZE=500
export OPENZIM_MCP_CACHE__PERSISTENCE_ENABLED=true
export OPENZIM_MCP_RATE_LIMIT__REQUESTS_PER_SECOND=20
export OPENZIM_MCP_RATE_LIMIT__BURST_SIZE=40
openzim-mcp --transport http /srv/zim

Front it with a TLS-terminating reverse proxy (Caddy, nginx, traefik) — there is no built-in TLS.

Memory-constrained (e.g. small VPS, RPi)

export OPENZIM_MCP_CACHE__MAX_SIZE=25
export OPENZIM_MCP_CACHE__TTL_SECONDS=900
export OPENZIM_MCP_CONTENT__MAX_CONTENT_LENGTH=50000
export OPENZIM_MCP_CONTENT__SNIPPET_LENGTH=500
export OPENZIM_MCP_WATCH_INTERVAL_SECONDS=30
openzim-mcp ~/zim-files

Validating configuration

There is no offline --validate flag. Two options:

  1. Start the server with stdio. Bad config raises OpenZimMcpConfigurationError immediately. The error message names the offending field.
  2. Call zim_health from your MCP client. It returns combined health, resolved configuration, and loaded archives in one response. Example prompt:

"Show the current server health and configuration"

The response includes (abbreviated):

  • .health — server status, uptime, cache performance, health checks, warnings, recommendations
  • .configuration — resolved values (no secrets; server_pid redacted)
  • .loaded_archives — list of every ZIM file in the allowed directories

Setting environment variables

Linux / macOS:

echo 'export OPENZIM_MCP_CACHE__MAX_SIZE=200' >> ~/.bashrc
source ~/.bashrc

Windows (PowerShell):

$env:OPENZIM_MCP_CACHE__MAX_SIZE = "200"
[Environment]::SetEnvironmentVariable("OPENZIM_MCP_CACHE__MAX_SIZE", "200", "User")

systemd unit:

[Service]
Environment=OPENZIM_MCP_TRANSPORT=http
Environment=OPENZIM_MCP_HOST=127.0.0.1
# secrets.env holds OPENZIM_MCP_AUTH_TOKEN (systemd comments must be on their own line)
EnvironmentFile=/etc/openzim-mcp/secrets.env
ExecStart=/usr/local/bin/openzim-mcp /srv/zim

Docker: see HTTP and Docker Deployment.

Stale env vars (not in code)

The following env-var namespaces appeared in pre-1.0 / early-v2 documentation and do not exist in the current codebase. If a tool or example tells you to set them, the source is stale:

  • OPENZIM_MCP_INSTANCE__* — multi-instance conflict tracking was removed entirely.
  • OPENZIM_MCP_SECURITY__* — there is no SecurityConfig. Path validation, input sanitization, and limits are all controlled by the values listed above.
  • OPENZIM_MCP_SMART_RETRIEVAL__* — smart retrieval shares the global cache; there are no dedicated knobs.
  • OPENZIM_MCP_METRICS__* and OPENZIM_MCP_MONITORING__* — no first-party metrics endpoint; use /healthz, /readyz, and zim_health instead.
  • OPENZIM_MCP_SERVER__MAX_CONCURRENT, OPENZIM_MCP_SERVER__REQUEST_TIMEOUT, OPENZIM_MCP_SERVER_DESCRIPTION, OPENZIM_MCP_SERVER__ENABLE_MONITORING — never existed.
  • OPENZIM_MCP_CONTENT__CONVERT_HTML, OPENZIM_MCP_CONTENT__PRESERVE_FORMATTING — content processing is unconditional.
  • OPENZIM_MCP_LOGGING__JSON, OPENZIM_MCP_LOGGING__SECURITY_EVENTS — only level and format are configurable.

Tuning? See Performance Optimization. Deploying over HTTP? See HTTP and Docker Deployment.

v1.x is in maintenance through 2026-11-27. See CHANGELOG for the v1 → v2 migration table.
