MothRag

Answers to the questions one search can't reach.

MothRag is built for the hard questions — the ones whose answer is spread across many documents, where ordinary AI search misses it. It connects the evidence and shows the reasoning behind every answer — deterministically, so the same question gives the same answer every time, not a different roll of the dice. New information is added just by embedding it — there's no knowledge graph to rebuild — so it stays accurate on data that changes every day. And it runs entirely on the LLM APIs you already pay for: no GPUs, no model hosting, no lock-in.

Research-grade accuracy · Stays current as data changes · No GPUs, no model hosting · Runs on any LLM API · Shows its work

Production AI search stops at one lookup. Real questions don't.

Most AI search does a single lookup and stops. That breaks the moment a question spans multiple documents, chains entities, or compares facts across sources — exactly the questions that matter most in real knowledge work. MothRag is built for that case, and ships as a Python package you point at any LLM API you already use.
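To make the single-lookup failure concrete, here is a toy sketch. It is not MothRag's retrieval method: the three-document corpus, the word-overlap scoring, and the `retrieve` helper are all invented for illustration. The first lookup finds a document that names the director but not her birthplace, and only a second lookup, informed by the first, reaches the answer.

```python
import re

# Invented three-document corpus, for illustration only.
CORPUS = {
    "film": "The film Night Harbor was directed by Ana Ruiz.",
    "person": "Ana Ruiz was born in Porto.",
    "city": "Porto is a city in Portugal.",
}


def words(text):
    """Lowercased word set; a crude stand-in for a real retriever's features."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, exclude=()):
    """Return the id of the document sharing the most words with the query."""
    query_words = words(query)
    scored = [
        (len(query_words & words(text)), doc_id)
        for doc_id, text in CORPUS.items()
        if doc_id not in exclude
    ]
    return max(scored)[1]


question = "Where was the director of Night Harbor born?"

# Hop 1: a single lookup finds the film, which names the director
# but does not say where she was born.
first_hop = retrieve(question)

# Hop 2: search again with the first document's text added to the query,
# so the director's name can pull in the document that holds the answer.
second_hop = retrieve(question + " " + CORPUS[first_hop], exclude=(first_hop,))

evidence_trail = [first_hop, second_hop]
```

Here `evidence_trail` records which documents were used and in what order, which is the kind of trail a reader would need in order to check a multi-hop answer.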

Built for the hard questions.

It connects facts across many sources to answer the multi-hop questions a single search can't.

No infrastructure to run.

No GPUs to rent, no models to host, no special infrastructure — it runs on the standard LLM APIs you already use.

You can see why it answered.

Every answer comes with the evidence and the reasoning trail behind it, so you can check it, not just trust it. That is the difference between a demo and something you can put in front of customers.

Frontier quality — finally something you can actually deploy.

[Chart: F1 on HotpotQA, 2WikiMultiHopQA and MuSiQue. MothRag (commodity APIs, no GPU) outscores RAPTOR, GraphRAG and HippoRAG 2 on every benchmark.]

Against the most popular RAG systems, MothRag outscores every one on every benchmark, on commodity APIs alone.

[Chart: F1 on HotpotQA, 2WikiMultiHopQA and MuSiQue. MothRag (commodity APIs, no GPU) vs HippoRAG 2, CoRAG and NeocorRAG.]

F1 across three multi-hop benchmarks. MothRag reaches research-lab parity using only commodity LLM APIs, no GPU.

Accuracy

Frontier accuracy, deployable.

On the standard multi-hop benchmarks, MothRag beats the graph-based systems (HippoRAG 2, GraphRAG) outright and matches the best published research overall. The difference: other systems at this level need datacenter GPUs or models licensed for non-commercial use only, while MothRag reaches it on commodity APIs alone.

Infrastructure

Runs on commodity APIs.

No GPU fleet, no hosted models, no special infrastructure — MothRag runs entirely on the standard LLM APIs you already pay for.

Portability

No vendor lock-in.

Works with any model, today's or tomorrow's. Swap the engine underneath without retraining anything.

Transparency

Answers that show their work.

Every answer is structured as an inspectable reasoning trail over the evidence it used, with a built-in agreement signal across its internal reasoning paths — so you can gauge confidence at a glance.
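The page does not say how the agreement signal is computed. As one plausible reading, assuming each internal reasoning path produces its own candidate answer, the signal could be the share of paths that land on the majority answer. The `agreement_signal` function below is a hypothetical sketch of that idea, not MothRag's API.

```python
from collections import Counter


def agreement_signal(path_answers):
    """Return (majority answer, fraction of paths that agree with it).

    Hypothetical helper: assumes one candidate answer per reasoning path.
    """
    if not path_answers:
        raise ValueError("need at least one reasoning path")
    # Normalize so trivial differences in case or spacing do not split votes.
    normalized = [answer.strip().lower() for answer in path_answers]
    answer, votes = Counter(normalized).most_common(1)[0]
    return answer, votes / len(normalized)


# Four paths, three of which agree.
answer, agreement = agreement_signal(["Paris", "paris ", "Lyon", "Paris"])
```

A value near 1.0 means the paths converged on one answer; a low value flags an answer worth checking against the evidence trail.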

Freshness

Stays current as data changes.

New information is added just by embedding it: there is no knowledge graph to rebuild and nothing to retrain. Graph-based systems must rebuild their index, and training-based systems must retrain, whenever the data changes; MothRag does neither.
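The general idea of embedding-only updates can be sketched as an append-only index. This is an illustration under stated assumptions, not MothRag's implementation: `toy_embed` is a hypothetical stand-in for a real embedding API call (here just sparse word counts), and `VectorIndex` is an invented class.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding API call: sparse word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorIndex:
    """Append-only index: adding a document touches no existing entry."""

    def __init__(self):
        self.entries = []  # list of (text, vector) pairs

    def add(self, text):
        self.entries.append((text, toy_embed(text)))

    def search(self, query, k=1):
        query_vec = toy_embed(query)
        scored = [(cosine(query_vec, vec), text) for text, vec in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


index = VectorIndex()
index.add("The Q3 report lists revenue of 4 million.")
index.add("The office moved to Lisbon in 2021.")
snapshot = list(index.entries)

# New data arrives: one embed call and one append, with no rebuild step.
index.add("The Q4 report lists revenue of 5 million.")
top = index.search("Q4 report revenue", k=1)
```

The new document is searchable immediately after `add`, and the earlier entries are left exactly as they were, which is the property that keeps update cost proportional to the new data rather than to the whole corpus.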

Measured at research scale — and honest about it.

Validated across three standard multi-hop benchmarks at the same scale researchers use (1000 evaluations each, Llama-3.3-70B reader): on par with the published research frontier — while running entirely on the standard LLM APIs you already use. Full numbers and methodology in the published paper.

Quality: On par with research SOTA
Benchmarks: 3 standard, multi-hop
Scale: 1000 evals each
Reader: Llama-3.3-70B
Hardware: Any LLM API, no GPU

Open source. Paper published.

MothRag is open source on GitHub and installable from PyPI with pip install mothrag. It ships a one-command CLI: run mothrag demo for an instant multi-hop answer over a bundled corpus, then point mothrag query at your own documents. The paper documenting the method and results is published on Zenodo.
