Privacy-first semantic search for researchers who work with serious text collections.
Works with Zotero, Obsidian, Calibre, and plain folders. Runs entirely on your machine.
Open source (MIT) · v0.9 Beta · Python 3.11+
Cloud-based tools demand a high price: your privacy. You have no control over where your data goes, generated answers lack verifiable citations, and generic models fail when confronted with rare, specialized, or unpublished academic materials. Your years of careful curation—annotations, tags, reading notes—are invisible to them.
We bring the intelligence to your data, not the other way around. Everything runs locally on your machine. Every answer links directly to the exact passage in your source. It works with the tools you already use—Zotero, Obsidian, Calibre, or just a folder of PDFs—respecting your workflow without demanding migration.
Your books never leave your machine. No cloud uploads, no telemetry, no tracking. Full sovereignty over your research data.
Every result links to the precise page, chapter, and section in your source. Hallucinations are for dreamers, not researchers.
12 tools for Claude Desktop via Model Context Protocol. Your AI assistant searches your library directly. ChatGPT support in development.
Zotero, Obsidian vaults, Calibre libraries, or plain folders of PDFs. The library type is auto-detected once you point ARCHILLES at it. No migration required.
Search in German, English, Latin, Greek, French—or all at once. BGE-M3 multilingual embeddings understand meaning across languages.
Tags, annotations, highlights, reading notes, custom fields—all indexed and searchable. Years of curation become a research superpower.
Set ARCHILLES_LIBRARY_PATH to your Zotero library, Obsidian vault, Calibre folder, or any directory.
Batch-index by tag, author, or entire library. 30+ formats. GPU-accelerated. Resume anytime.
Hybrid semantic + keyword search with Reciprocal Rank Fusion (RRF). Optional cross-encoder reranking for precision.
Get results with page numbers, chapters, and sections. Export as BibTeX, RIS, or Markdown.
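For readers curious how the hybrid search step can merge two result lists, here is a minimal sketch of Reciprocal Rank Fusion in Python. It illustrates the general technique, not ARCHILLES's actual code: the function name `rrf_fuse`, the passage IDs, and the constant `k=60` (a value commonly used for RRF) are assumptions made for this example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of IDs with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) to an item's score, so items
    that rank well in more than one list rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical passage IDs returned by the two retrievers.
semantic_hits = ["p7", "p12", "p3"]   # embedding-based ranking
keyword_hits = ["p7", "p9", "p12"]    # keyword-based ranking

fused = rrf_fuse([semantic_hits, keyword_hits])
print(fused)  # ['p7', 'p12', 'p9', 'p3']
```

Because RRF uses only rank positions, it needs no score normalization between the semantic and keyword retrievers, which is one reason it is a popular choice for hybrid search.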
ARCHILLES speaks Model Context Protocol (MCP)—the open standard for AI tool integration. Connect it to Claude Desktop and ask questions in natural language.
# In Claude Desktop, just ask:
"Search my books for discussions of political legitimacy in early modern Europe"
"What did I highlight about medieval trade routes?"
"List all books by Hannah Arendt in my library"
"Export my Philosophy books as BibTeX"
"Set my research interests to: prosopography, late antique senators"
12 MCP tools available: search, annotations, metadata, bibliography export, research interests, duplicate detection, and more. Currently optimized for Claude Desktop (free tier available). HTTP/SSE transport for ChatGPT, Codex, and other clients is in active development.
Semantic understanding meets keyword precision. Reciprocal Rank Fusion (RRF) merges both rankings in every query.
PDF chapters and sections are read from the table of contents. EPUB structure is preserved. Running headers and footers are removed automatically.
Optional second-stage cross-encoder reranker scores each query-document pair directly, improving the relevance ordering of the top results.
Register project-specific keywords once. They boost every subsequent search—no re-indexing required.
Bibliography, index, and front matter excluded by default. No more noise from reference lists drowning out actual content.
BibTeX, RIS, EndNote, JSON, CSV. Filter by author, tag, or year. One command from the CLI or one prompt in Claude.
Tesseract integration for older monographs and dissertations. An upgrade path to OCR based on vision-language models (VLMs) is prepared.
NVIDIA CUDA, Apple Silicon MPS, or CPU. Hardware-adaptive profiles handle everything from a laptop to a workstation.
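As a rough illustration of hardware-adaptive selection, the sketch below picks a compute device and a matching profile. It assumes a PyTorch backend, which this page does not confirm for ARCHILLES; `pick_device`, `PROFILES`, and the batch sizes are hypothetical names and values chosen for the example.

```python
def pick_device():
    """Return 'cuda', 'mps', or 'cpu' based on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed: fall back to CPU
    if torch.cuda.is_available():
        return "cuda"  # NVIDIA GPU
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"  # Apple Silicon GPU
    return "cpu"


# Hypothetical profiles; the batch sizes are illustrative only.
PROFILES = {
    "cuda": {"batch_size": 64},
    "mps": {"batch_size": 16},
    "cpu": {"batch_size": 4},
}

device = pick_device()
profile = PROFILES[device]
print(f"device={device}, batch_size={profile['batch_size']}")
```

Keeping the device check separate from the profile table means a laptop and a workstation can share the same indexing code and differ only in the settings that are looked up.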
"Find all discussions of trade routes between Mediterranean and Northern Europe before 1500"
Searches across Latin primary sources, German monographs, and English translations simultaneously.
"Trace the motif of unreliable narrators across these 50 twentieth-century novels"
Finds passages that demonstrate the concept, even when the texts never use that term.
"Compare views on the hard problem of consciousness across Chalmers, Dennett, and Nagel"
Precise name matching meets semantic understanding of philosophical concepts.
"Find all precedents on liability for AI-generated decisions across my EU law collection"
Commentary, case law, and regulatory texts searched simultaneously. Custom fields like jurisdiction work as filters.
Translate entire books, clean up PDF text, or proofread OCR output. Supports Gemini, OpenAI, and Claude APIs. Runs entirely in your browser—your API keys never touch our servers.
Three modes: Passage for quick translations, Document for DOCX/EPUB files, Proofread for spelling correction without style changes.
ARCHILLES is open source and in active development. We're looking for researchers from diverse disciplines with substantial personal libraries (500+ titles) to help shape the tool.
No spam. Purely academic updates and beta access.