Transform Static ZIM Archives into Dynamic Knowledge Engines
OpenZIM MCP is a modern, secure MCP server that enables AI models to access and search ZIM format knowledge bases offline with intelligent, structured access patterns.
{
  "openzim-mcp": {
    "command": "uv",
    "args": [
      "run", "openzim-mcp",
      "/path/to/zim/files"
    ]
  }
}
Why LLMs Love OpenZIM MCP
Unlike basic file readers, OpenZIM MCP provides intelligent, structured access that LLMs need to effectively navigate and understand vast knowledge repositories.
Smart Navigation
Browse by namespace (articles, metadata, media) instead of blind searching. Get structured access to content organization.
Context-Aware Discovery
Get article structure, relationships, and metadata for deeper understanding. Extract links and content connections.
Intelligent Search
Advanced filtering, auto-complete suggestions, and relevance-ranked results with namespace and content type filters.
High Performance
LRU cache with TTL, intelligent eviction policies, and optimized ZIM operations, backed by a test suite with 90%+ coverage.
Relationship Mapping
Extract internal/external links to understand content connections. Build knowledge graphs from ZIM content.
Security First
Comprehensive input validation, path traversal protection, and secure resource management with type safety.
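As a rough illustration of the Relationship Mapping feature above, the sketch below separates internal from external links in an article's HTML using only the standard library. The names `LinkCollector` and `extract_links` and the list of external schemes are assumptions made for this example; they are not taken from the OpenZIM MCP source.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Schemes treated as leaving the archive (an assumption for this sketch).
EXTERNAL_SCHEMES = {"http", "https", "ftp", "mailto"}


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, split into internal and external."""

    def __init__(self) -> None:
        super().__init__()
        self.internal = []
        self.external = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href or href.startswith("#"):
            return  # skip missing hrefs and same-page anchors
        parsed = urlparse(href)
        if parsed.netloc or parsed.scheme in EXTERNAL_SCHEMES:
            self.external.append(href)
        else:
            self.internal.append(href)


def extract_links(html: str) -> dict:
    """Return {"internal": [...], "external": [...]} for one HTML document."""
    collector = LinkCollector()
    collector.feed(html)
    return {"internal": collector.internal, "external": collector.external}
```

Internal links collected this way could serve as edges when building a knowledge graph over articles in the same archive.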
Smart Retrieval System
Entry retrieval with automatic fallback and path mapping for reliable access to ZIM content.
Direct Access First
Attempts to retrieve entries using the exact path provided, optimizing for speed and accuracy.
Automatic Fallback
When direct access fails, automatically searches using various search terms and path variations.
Path Mapping Cache
Caches successful path mappings to improve performance for repeated access patterns.
Enhanced Error Guidance
Provides clear, actionable guidance when entries cannot be found, designed for LLM users.
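The four behaviours above can be sketched as a small lookup routine: try the cached mapping, then the exact path, then a few search terms derived from the path. This is a minimal sketch under stated assumptions, not the project's implementation; `SmartRetriever`, `candidate_terms`, `EntryNotFound`, and the two injected callables `read_entry` and `search` are hypothetical names introduced for illustration.

```python
class EntryNotFound(LookupError):
    """Raised when neither direct access nor fallback search finds an entry."""


def candidate_terms(path: str) -> list:
    """Derive search terms from a path such as "A/Machine_Learning"."""
    name = path.split("/", 1)[-1]      # drop a leading namespace, if any
    spaced = name.replace("_", " ")
    # dict.fromkeys removes duplicates while keeping the order.
    return list(dict.fromkeys([spaced, spaced.lower(), name]))


class SmartRetriever:
    def __init__(self, read_entry, search):
        self._read_entry = read_entry  # hypothetical: path -> content, KeyError if missing
        self._search = search          # hypothetical: term -> matching path, or None
        self._path_cache = {}          # requested path -> path that actually worked

    def get(self, path: str) -> str:
        cached = self._path_cache.get(path)
        if cached is not None:
            return self._read_entry(cached)
        try:
            content = self._read_entry(path)   # direct access first
            self._path_cache[path] = path
            return content
        except KeyError:
            pass
        for term in candidate_terms(path):     # automatic fallback
            found = self._search(term)
            if found is not None:
                self._path_cache[path] = found
                return self._read_entry(found)
        raise EntryNotFound(
            f"No entry for {path!r}; try searching for {candidate_terms(path)[0]!r}"
        )
```

The error message names a concrete next step, which is the kind of guidance an LLM caller can act on without human help.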
How It Works
# The system automatically handles path encoding differences:
# - Direct access: "A/Machine_Learning"
# - Fallback search: "Machine Learning", "machine learning"
# - Cached mapping: future requests use the cached path
# No more manual search-first methodology needed!
get_zim_entry(zim_file, "A/Machine_Learning")
Enterprise-Grade Security
Comprehensive security measures designed to protect against vulnerabilities and ensure safe operation in production environments.
Path Traversal Protection
Advanced path validation prevents directory traversal attacks using secure path checking with Python 3.9+ features.
Input Validation & Sanitization
Comprehensive input validation with length limits, character filtering, and sanitization to prevent injection attacks.
Type Safety & Validation
Full type annotations with Pydantic validation ensure data integrity and prevent type-related vulnerabilities.
Secure Error Handling
Sanitized error messages prevent information disclosure while providing helpful guidance for legitimate users.
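One common way to implement the path traversal protection described above is to resolve the requested path and confirm it still lies under the allowed directory. The sketch below uses `Path.is_relative_to`, which was added in Python 3.9; the function name `resolve_inside` is an assumption for this example, and the project's actual checks may differ.

```python
from pathlib import Path


def resolve_inside(allowed_dir: str, requested: str) -> Path:
    """Resolve `requested` under `allowed_dir`, rejecting anything that escapes it."""
    base = Path(allowed_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    target = (base / requested).resolve()
    if not target.is_relative_to(base):
        # Keep the message generic so it does not disclose filesystem layout.
        raise ValueError("path escapes the allowed directory")
    return target
```

Resolving before comparing matters: a plain string prefix check would accept `allowed/../secret`, whereas the resolved path no longer starts under the base directory.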
Advanced Enterprise Features
Production-ready capabilities for enterprise deployments, monitoring, and multi-instance environments.
Multi-Instance Management
Automatic instance tracking and conflict detection ensures reliable operation when multiple server instances are running.
- Automatic instance registration with unique process IDs
- Configuration hash validation for compatibility
- Stale instance cleanup and orphaned file detection
- Real-time conflict detection and resolution
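To make the configuration hash idea concrete, here is one possible way two instances could compare settings: serialize the configuration canonically and hash it. This is an assumption about how such a check might work, not a description of the project's code; `config_hash` and `is_compatible` are hypothetical names.

```python
import hashlib
import json


def config_hash(config: dict) -> str:
    """Hash a configuration so that key order does not affect the result."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_compatible(my_config: dict, other_hash: str) -> bool:
    """Two instances are treated as compatible when their hashes match."""
    return config_hash(my_config) == other_hash
```

An instance could store its hash alongside its process ID when it registers, so a newly started server can detect a conflicting configuration without reading the other instance's full settings.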
Health Monitoring & Diagnostics
Comprehensive health checks and diagnostic tools provide deep insights into server performance and status.
- Built-in health check endpoints
- Cache performance metrics and statistics
- Instance tracking status and recommendations
- Configuration validation and diagnostics
Intelligent Caching System
Advanced LRU cache with TTL support and intelligent eviction policies optimizes performance for large-scale deployments.
- LRU (Least Recently Used) eviction strategy
- Configurable TTL (Time To Live) for entries
- Automatic expired entry cleanup
- Path mapping cache for retrieval optimization
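The caching behaviour listed above can be sketched with an `OrderedDict`: each entry stores an expiry time, reads move the key to the most-recent end, and writes evict from the least-recent end. The class name `TTLLRUCache`, its defaults, and the injectable `clock` parameter are assumptions made for this example rather than the project's API.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    def __init__(self, max_size=100, ttl_seconds=3600.0, clock=time.monotonic):
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so tests can control time
        self._data = OrderedDict()     # key -> (expires_at, value)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._data[key]        # expired: drop it and report a miss
            return None
        self._data.move_to_end(key)    # mark as most recently used
        return value

    def set(self, key, value):
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max_size:
            self._data.popitem(last=False)   # evict the least recently used

    def purge_expired(self) -> int:
        """Remove all expired entries and return how many were removed."""
        now = self._clock()
        expired = [k for k, (exp, _) in self._data.items() if now >= exp]
        for k in expired:
            del self._data[k]
        return len(expired)
```

A path mapping cache could reuse the same structure, with the requested path as the key and the resolved path as the value.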
Modern Architecture
Modular design with dependency injection, full type safety, and comprehensive configuration management.
- Dependency injection for testability
- 100% type annotations with mypy validation
- Pydantic-based configuration with validation
- Structured logging with configurable levels
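The project's environment variables use names such as `OPENZIM_MCP_CACHE__MAX_SIZE`, where the double underscore appears to separate a settings group from a field. The sketch below shows how such names could map onto nested settings using only the standard library. It is an illustration under that assumption; `settings_from_env` is a hypothetical helper, and values stay as strings here, whereas a Pydantic-based loader would also validate and convert them.

```python
import os

PREFIX = "OPENZIM_MCP_"


def settings_from_env(environ=None) -> dict:
    """Collect prefixed variables into a nested dict, splitting names on "__"."""
    environ = os.environ if environ is None else environ
    settings = {}
    for name, raw in environ.items():
        if not name.startswith(PREFIX):
            continue
        parts = name[len(PREFIX):].lower().split("__")
        node = settings
        for part in parts[:-1]:
            child = node.get(part)
            if not isinstance(child, dict):
                child = node[part] = {}   # create (or replace) the group
            node = child
        node[parts[-1]] = raw             # leaf value, still a string
    return settings
```

With this mapping, `OPENZIM_MCP_LOGGING__LEVEL=INFO` ends up under `settings["logging"]["level"]`.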
Developer Experience
Modern development workflow with automated releases, comprehensive tooling, and enterprise-grade CI/CD.
Automated Release System
Release-please integration with semantic versioning, automated changelog generation, and PyPI deployment.
Enhanced Makefile Workflow
Comprehensive development workflow with categorized help, security scanning, and cross-platform compatibility.
Comprehensive Testing
90%+ test coverage with pytest, benchmarking, integration tests, and automated quality assurance.
Code Quality Tools
Black formatting, flake8 linting, mypy type checking, bandit security scanning, and pre-commit hooks.
Development Workflow
# Complete development setup in one command
make install
# Run all quality checks
make check
# Run tests with coverage
make test
# Security scanning
make security
# Build and publish
make build && make publish
Quick Installation
Get up and running with OpenZIM MCP in just a few minutes.
Install with uv
# Install OpenZIM MCP with uv (recommended)
uv add openzim-mcp
# Or install globally with uv
uv tool install openzim-mcp
Prepare ZIM Files
# Create directory for ZIM files
mkdir ~/zim-files
# Download ZIM files from Kiwix Library
# https://browse.library.kiwix.org/
Run the Server
# Start the MCP server
uv run openzim-mcp /path/to/zim/files
# Or if installed globally
openzim-mcp /path/to/zim/files
Development Installation
For contributors and developers who want to work with the source code or need the latest features:
# Clone the repository
git clone https://github.com/cameronrye/openzim-mcp.git
cd openzim-mcp
# Install dependencies
uv sync
# Run from source
uv run python -m openzim_mcp /path/to/zim/files
Usage Examples
See OpenZIM MCP in action with real-world examples and API calls.
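MCP clients send tool invocations as JSON-RPC 2.0 requests using the `tools/call` method, with the tool name and arguments carried in `params`. The sketch below wraps a name and arguments pair, in the shape used by the examples in this section, into such a request. The helper `tool_call_request` and its ID counter are assumptions for illustration; a real client library would normally build this envelope for you.

```python
import json
from itertools import count

_request_ids = count(1)   # simple increasing request IDs for this sketch


def tool_call_request(name: str, arguments: dict) -> str:
    """Build the JSON-RPC message an MCP client would send for one tool call."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


request = tool_call_request(
    "search_zim_file",
    {
        "zim_file_path": "wikipedia_en_100_2025-08.zim",
        "query": "artificial intelligence",
        "limit": 5,
    },
)
```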
Search ZIM Files
{
  "name": "search_zim_file",
  "arguments": {
    "zim_file_path": "wikipedia_en_100_2025-08.zim",
    "query": "artificial intelligence",
    "limit": 5
  }
}
Response:
Found 42 matches for "artificial intelligence", showing 1-5:
## 1. Artificial Intelligence
Path: Artificial_intelligence
Snippet: Artificial intelligence (AI) is intelligence demonstrated by machines...
## 2. Machine Learning
Path: Machine_learning
Snippet: Machine learning is a subset of artificial intelligence...
Browse Namespaces
{
  "name": "browse_namespace",
  "arguments": {
    "zim_file_path": "wikipedia_en_100_2025-08.zim",
    "namespace": "C",
    "limit": 10,
    "offset": 0
  }
}
Response:
{
  "namespace": "C",
  "total_in_namespace": 80000,
  "offset": 0,
  "limit": 10,
  "returned_count": 10,
  "has_more": true,
  "entries": [
    {
      "path": "C/Biology",
      "title": "Biology",
      "content_type": "text/html",
      "preview": "Biology is the scientific study of life..."
    }
  ]
}
Get Article Structure
{
  "name": "get_article_structure",
  "arguments": {
    "zim_file_path": "wikipedia_en_100_2025-08.zim",
    "entry_path": "C/Evolution"
  }
}
Response:
{
  "title": "Evolution",
  "path": "C/Evolution",
  "content_type": "text/html",
  "headings": [
    {"level": 1, "text": "Evolution", "id": "evolution"},
    {"level": 2, "text": "History", "id": "history"},
    {"level": 2, "text": "Mechanisms", "id": "mechanisms"}
  ],
  "sections": [
    {
      "title": "Evolution",
      "level": 1,
      "content_preview": "Evolution is the change in heritable traits...",
      "word_count": 150
    }
  ],
  "word_count": 5000
}
MCP Client Configuration
{
  "mcpServers": {
    "openzim-mcp": {
      "command": "uv",
      "args": [
        "run",
        "openzim-mcp",
        "/path/to/zim/files"
      ]
    }
  }
}
Alternative: Global Installation
{
  "mcpServers": {
    "openzim-mcp": {
      "command": "openzim-mcp",
      "args": [
        "/path/to/zim/files"
      ]
    }
  }
}
Environment Variables (Optional):
# Cache configuration
export OPENZIM_MCP_CACHE__ENABLED=true
export OPENZIM_MCP_CACHE__MAX_SIZE=200
export OPENZIM_MCP_CACHE__TTL_SECONDS=7200
# Content configuration
export OPENZIM_MCP_CONTENT__MAX_CONTENT_LENGTH=200000
export OPENZIM_MCP_CONTENT__SNIPPET_LENGTH=2000
# Logging configuration
export OPENZIM_MCP_LOGGING__LEVEL=INFO
Documentation & Resources
Comprehensive guides, API references, and community resources to help you get the most out of OpenZIM MCP.
API Reference
Complete documentation of all available MCP tools, parameters, and response formats.
View API Docs
Quick Start Guide
Step-by-step tutorial to get OpenZIM MCP running in your environment quickly.
Start Tutorial
Configuration Guide
Advanced configuration options, environment variables, and performance tuning.
Configure
Troubleshooting
Common issues, solutions, and debugging tips for OpenZIM MCP deployment.
Get Help
Architecture Overview
Deep dive into the system architecture, components, and design decisions.
Learn More
Contributing
Guidelines for contributing code, reporting issues, and joining the community.
Contribute