docshelf-mcp
Put your manuals on a shelf, hand the AI the index.
An MCP server that turns a folder of PDFs and Markdown into a chat-project-friendly document collection.
AI agents see a single INDEX.md and pull individual sections by raw GitHub URL on demand — instead of choking on a 4 MB datasheet.
The problem
You have 30 hardware manuals, or 200 cooking recipes, or a stack of research PDFs. You want Claude / ChatGPT to answer questions across them — but:
- ❌ You can’t dump 80 MB of PDFs into a chat project. It won’t fit, and you’d burn the context window even if it did.
- ❌ Copy-pasting the relevant pages works only after you remember which manual mentioned the thing.
- ❌ Long files mean wasteful retrieval — the model loads a whole RouterOS guide just to answer about VLANs.
The fix
docshelf-mcp turns any folder of documents into a navigable shelf:
- Convert — PDFs → clean Markdown via pymupdf4llm.
- Split — large manuals split by chapter into 1–10 KB sections.
- Index — auto-generated INDEX.md lists every document with raw GitHub URLs.
- Fetch on demand — AI reads the small index, then fetches only the section it needs over HTTPS.
You drop the 5 KB INDEX.md into your AI project. The 50 MB of source stays on GitHub.
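To make the split and index steps concrete, here is a minimal sketch in plain Python. It is not docshelf-mcp's actual implementation: the helper names (`slugify`, `split_by_chapter`, `index_lines`), the `RAW_BASE` repository URL, and the choice to split on top-level `# ` headings are all assumptions made for illustration.

```python
import re
from urllib.parse import quote

# Hypothetical repository; substitute your own owner/repo/branch.
RAW_BASE = "https://raw.githubusercontent.com/you/shelf/main"

def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def split_by_chapter(markdown: str) -> list[tuple[str, str]]:
    """Split on top-level '# ' headings and return (title, body) pairs.

    Text before the first heading is dropped, and '# ' lines inside
    code blocks are not special-cased; this is only a sketch.
    """
    sections = []
    title, lines = None, []
    for line in markdown.splitlines():
        if line.startswith("# "):
            if title is not None:
                sections.append((title, "\n".join(lines).strip()))
            title, lines = line[2:].strip(), []
        elif title is not None:
            lines.append(line)
    if title is not None:
        sections.append((title, "\n".join(lines).strip()))
    return sections

def index_lines(sections: list[tuple[str, str]], doc_dir: str) -> list[str]:
    """Build one Markdown link per section, pointing at its raw URL."""
    out = []
    for n, (title, _body) in enumerate(sections, start=1):
        path = f"{doc_dir}/{n:02d}-{slugify(title)}.md"
        out.append(f"- [{title}]({RAW_BASE}/{quote(path)})")
    return out

manual = "# Overview\nIntro text.\n# PCIe register map\nBAR layout.\n"
sections = split_by_chapter(manual)
index = index_lines(sections, "docs/hardware/nic")
print("\n".join(index))
```

The point of the design is that the index carries only titles and URLs, so its size grows with the number of sections rather than with the size of the manuals.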
Install
pip install docshelf-mcp
Or register the server through a Smithery-style MCP config in Claude Desktop:
{
  "mcpServers": {
    "docshelf": {
      "command": "docshelf-mcp",
      "env": { "DOCSHELF_ROOT": "/path/to/your/shelf" }
    }
  }
}
Quick start
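from docshelf_mcp import Shelf

# Root folder of the shelf (the repo that holds your documents).
shelf = Shelf("/path/to/your/repo")
# Add a PDF under a category, with a human-readable title for the index.
shelf.add_document("manuals/router.pdf", category="network", title="Mikrotik RouterOS")
# Regenerate INDEX.md so the new document is listed.
shelf.rebuild_index()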
Or via MCP tools from inside a Claude chat:
- docshelf_add_document(...) — convert + split + index a new file
- docshelf_rebuild_index() — regenerate the navigation page
- docshelf_search(query, max_results=10) — grep across all sections
- docshelf_list_documents(category=None) — catalog view
- docshelf_convert_pdf(pdf_path, out_dir, quality="fast") — one-shot conversion
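For intuition about what "grep across all sections" can mean, the sketch below runs a case-insensitive substring search over every Markdown file in a folder. It is an assumption-laden illustration, not the code behind `docshelf_search`: the function name `search_sections`, its return shape, and the line-by-line matching rule are hypothetical.

```python
import tempfile
from pathlib import Path

def search_sections(root: Path, query: str, max_results: int = 10) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, line text) for each matching line.

    Matching is a case-insensitive substring test, and the search stops
    as soon as max_results hits have been collected.
    """
    needle = query.lower()
    hits = []
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
                if len(hits) >= max_results:
                    return hits
    return hits

# Demo on a throwaway shelf with two small sections.
with tempfile.TemporaryDirectory() as tmp:
    shelf_dir = Path(tmp)
    (shelf_dir / "network").mkdir()
    (shelf_dir / "network" / "router.md").write_text(
        "# Bridging\nVLAN trunks use tagged ports.\n", encoding="utf-8"
    )
    (shelf_dir / "network" / "switch.md").write_text(
        "# Ports\nNo tagging here.\n", encoding="utf-8"
    )
    results = search_sections(shelf_dir, "vlan")
    print(results)
```

Because sections are small files, returning the path and line number is usually enough for the model to decide which section to fetch in full.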
Use cases
- 🏠 Homelab manuals — Mikrotik, Cudy, ASUS, Intel datasheets. AI answers “how do I configure VLAN trunks on RouterOS?” with the exact section quoted.
- 🍲 Cooking recipes — a folder of 200 family recipes. AI suggests dinner based on what’s in your fridge.
- 📚 Research papers — a stack of arXiv PDFs. AI synthesizes findings across them.
- 🧑‍🏫 Course materials — lectures, slides, homework. AI helps students find specific topics.
- 📑 Compliance documentation — internal SOPs, audit reports. AI surfaces the relevant policy.
Try it on a real shelf
There’s a live, public docshelf at https://github.com/ignatenkofi/gh.project.homelab — a homelab manuals collection covering routers, switches, NICs, PSUs, RAM, NAS, racks, and more.
Open the INDEX.md — that’s the only file a chat project needs. From there an AI agent can
follow links into chapter SUBINDEXes for the big manuals (RouterOS, X550)
or fetch the small per-device files directly.
A simplified INDEX.md looks like this:
# My Shelf — Index
## 🌐 Network gear
- [Router admin manual](https://raw.githubusercontent.com/you/shelf/main/docs/network/router-admin.md)
- [Switch configuration guide](https://raw.githubusercontent.com/you/shelf/main/docs/network/switch.md)
## 🖥 Hardware datasheets (split by chapter)
### NIC datasheet
- [Chapter 1 — Overview](https://raw.githubusercontent.com/you/shelf/main/docs/hardware/nic/01-overview.md)
- [Chapter 8 — Device registers](https://raw.githubusercontent.com/you/shelf/main/docs/hardware/nic/08-device-registers.md)
- [Chapter 9 — PCIe register map](https://raw.githubusercontent.com/you/shelf/main/docs/hardware/nic/09-pcie-register-map.md)
…
The AI sees a few KB of structure and the raw URLs. When asked
“how do I configure PCIe BARs?”, it fetches only
09-pcie-register-map.md and answers from that — not the whole 4 MB
original PDF.
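The fetch-on-demand step can be approximated in a few lines of standard-library Python. This is a sketch of what an agent's tooling might do, not part of docshelf-mcp: `parse_index`, `pick_section`, and `fetch_section` are hypothetical helpers, the keyword match on link titles is a deliberately naive stand-in for the model's own judgment, and the index entries are abbreviated from the sample above with its placeholder `you/shelf` URLs.

```python
import re
from typing import Optional
from urllib.request import urlopen

# Matches Markdown links of the form [title](https://...).
LINK_RE = re.compile(r"\[([^\]]+)\]\((https://[^)\s]+)\)")

def parse_index(index_md: str) -> dict[str, str]:
    """Map each link title in the index to its raw URL."""
    return {title: url for title, url in LINK_RE.findall(index_md)}

def pick_section(links: dict[str, str], keyword: str) -> Optional[str]:
    """Return the URL of the first entry whose title mentions the keyword."""
    for title, url in links.items():
        if keyword.lower() in title.lower():
            return url
    return None

def fetch_section(url: str, timeout: float = 10.0) -> str:
    """Download one section over HTTPS (requires network access)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")

index_md = """\
### NIC datasheet
- [Overview](https://raw.githubusercontent.com/you/shelf/main/docs/hardware/nic/01-overview.md)
- [PCIe register map](https://raw.githubusercontent.com/you/shelf/main/docs/hardware/nic/09-pcie-register-map.md)
"""
links = parse_index(index_md)
url = pick_section(links, "pcie")
print(url)
# With a real shelf, fetch_section(url) would download only that one file.
```

Only the index is parsed up front; the section body is requested when a question actually needs it.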
Resources
- 📦 PyPI — pip install docshelf-mcp
- 🐙 GitHub repo — source, issues, contributing
- 🌐 Glama listing — install button + security score
- 📋 Project prompts — ready-to-paste instructions for Claude / ChatGPT / API
- 📖 Architecture — how it works internally
License
MIT — © ignatenkofi