maubot-plugins/wiki/README.md
Padreug 8f83d8df5e feat(wiki): docs-lookup plugin against Quartz contentIndex
New maubot plugin that points at any Quartz-rendered docs site and
answers chat queries by full-text searching its emitted
/static/contentIndex.json. Default config targets docs.ariege.io
(castle-docs).

Commands:
  !ask <query>            search corpus; top-N hits with snippet + link
  !doc <slug-or-title>    open a specific page (fuzzy title match)
  !wiki / !wiki refresh   status / force re-index

Architecture:
- Periodic fetch (default 10 min) of /static/contentIndex.json
- In-memory inverted-ish scoring: title hit 5pt, content hit 1pt + freq
- No LLM — pure deterministic keyword search; RAG is future Phase 2b
- No DB — index is upstream-derived cache, repopulates on bot restart

Deployment posture: docs.ariege.io is served from cfaun alongside
maubot, so the bot hits it over the host's internal network — works
during WAN outages. base-config.yaml exposes docs_url + index_path
for adopters pointing at their own site.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 16:40:11 +02:00


# wiki
Documentation-lookup Matrix bot. Points at any
[Quartz](https://quartz.jzhao.xyz/)-rendered docs site, periodically
fetches its `contentIndex.json`, and answers queries in chat.
Designed to be community-portable — works against any Quartz site you
configure it for, not just `docs.ariege.io`. Adjust `docs_url` per
instance.
## Commands
```
!ask <question>        # full-text search the docs, top 3 with snippets
!doc <slug-or-title>   # open a specific page (exact slug or fuzzy title)
!wiki                  # status: doc count, last refresh, source URL
!wiki refresh          # force re-index now (admin nicety)
```
## Examples
```
!ask how do I shut the water off
!ask alpaca feeding winter
!ask power outage
!doc emergency/water-emergency
!doc water emergency   # fuzzy title match works too
!wiki                  # are we up to date?
```
The bot replies with markdown links to the doc pages, so clicking
through opens the full doc in a browser.
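
The `!doc` lookup order (exact slug first, then fuzzy title) can be sketched roughly as follows. This is an illustration only, not the plugin's actual code: `resolve_doc`, the `cutoff` value, and the use of `difflib` are assumptions, and the `animals/alpaca-feeding` slug is made up for the example.

```python
import difflib


def resolve_doc(query, index, cutoff=0.6):
    """Return a slug for an exact slug match, else the closest title, else None.

    `index` is a {slug: {"title": ..., "content": ...}} map. The cutoff is an
    assumed value, not one taken from the plugin.
    """
    wanted = query.strip().strip("/")
    if wanted in index:
        return wanted
    # Map lowercased titles back to slugs so matching ignores case.
    by_title = {entry["title"].lower(): slug for slug, entry in index.items()}
    close = difflib.get_close_matches(wanted.lower(), list(by_title), n=1, cutoff=cutoff)
    return by_title[close[0]] if close else None


# Hypothetical two-page index for illustration.
index = {
    "emergency/water-emergency": {"title": "Water Emergency", "content": "..."},
    "animals/alpaca-feeding": {"title": "Alpaca Feeding", "content": "..."},
}

print(resolve_doc("emergency/water-emergency", index))
print(resolve_doc("water emergency", index))
```

Both calls resolve to the same page; a query with no close title returns `None`, which a bot could turn into a "not found" reply.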
## How it works
Quartz emits `/static/contentIndex.json` as part of its standard build
— a flat `{slug: {title, content, tags}}` map of every published page.
The plugin fetches that file on a timer (default every 10 minutes),
keeps an in-memory inverted index, and scores searches by:
- Title hits: 5 points each
- Content hits: 1 point + 0.1 × frequency

Top N (default 3) results come back with a short snippet around the
first match. **No LLM is involved** in v1 — pure deterministic keyword
search. Phase 2b / future work may add an LLM synthesis step (RAG)
once the inference layer is up.
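
A minimal sketch of the scoring described above, assuming "hit" means a query word that appears as a whole word in the title or content. The function names, the tokenizer, and the tie-breaking by slug are assumptions for illustration; the plugin's own implementation may differ.

```python
import re

TITLE_POINTS = 5.0    # per query term found in the title
CONTENT_POINTS = 1.0  # per query term found in the content
FREQ_WEIGHT = 0.1     # extra per occurrence in the content
WORD = re.compile(r"[A-Za-z0-9]+")


def tokenize(text):
    """Lowercased alphanumeric words (an assumed tokenizer)."""
    return [m.group().lower() for m in WORD.finditer(text)]


def score(query, entry):
    """Score one {title, content} entry against the query terms."""
    title_words = set(tokenize(entry.get("title", "")))
    content_words = tokenize(entry.get("content", ""))
    total = 0.0
    for term in set(tokenize(query)):
        if term in title_words:
            total += TITLE_POINTS
        freq = content_words.count(term)
        if freq:
            total += CONTENT_POINTS + FREQ_WEIGHT * freq
    return total


def snippet(query, content, width=160):
    """Return about `width` characters around the first matching term."""
    terms = set(tokenize(query))
    for m in WORD.finditer(content):
        if m.group().lower() in terms:
            start = max(0, m.start() - width // 2)
            return content[start:start + width]
    return content[:width]


def search(query, index, max_results=3):
    """Return up to max_results (score, slug) pairs, best first."""
    scored = [(score(query, entry), slug) for slug, entry in index.items()]
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return hits[:max_results]


# Hypothetical index in the {slug: {title, content}} shape.
index = {
    "emergency/water-emergency": {
        "title": "Water Emergency",
        "content": "Shut the water off at the main valve. The water main is in the cellar.",
    },
    "animals/alpaca-feeding": {
        "title": "Alpaca Feeding",
        "content": "In winter, give the alpacas extra hay and fresh water.",
    },
}

for points, slug in search("water", index):
    print(slug, round(points, 2), "|", snippet("water", index[slug]["content"], 40))
```

With these weights a single title hit outweighs several content hits, so pages named after the query term rank above pages that merely mention it.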
## Config
`base-config.yaml` (override per maubot instance from the UI):
```yaml
docs_url: https://docs.ariege.io        # Quartz site base URL
index_path: /static/contentIndex.json   # standard Quartz path
refresh_minutes: 10                     # re-fetch cadence
max_results: 3                          # !ask hit limit
snippet_chars: 160                      # snippet window
site_name: Castle Docs                  # human-readable label in output
```
For internal-network deployments (the recommended posture — see below),
set `docs_url: http://<internal-hostname>` instead of the public URL.
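
As a rough illustration of how `docs_url` and `index_path` might combine into the URL the bot fetches, here is a small sketch. `DEFAULTS` mirrors the values above; `index_url` and the `docs.internal` hostname are hypothetical, and the plugin may build the URL differently.

```python
# Defaults copied from base-config.yaml above.
DEFAULTS = {
    "docs_url": "https://docs.ariege.io",
    "index_path": "/static/contentIndex.json",
    "refresh_minutes": 10,
    "max_results": 3,
    "snippet_chars": 160,
    "site_name": "Castle Docs",
}


def index_url(overrides):
    """Merge per-instance overrides over the defaults and build the fetch URL."""
    merged = {**DEFAULTS, **overrides}
    # Plain concatenation keeps any path prefix in docs_url
    # (e.g. a site served under /docs), whatever the slashes look like.
    return merged["docs_url"].rstrip("/") + "/" + merged["index_path"].lstrip("/")


print(index_url({}))
print(index_url({"docs_url": "http://docs.internal/"}))  # hypothetical internal host
```

String concatenation is used here rather than `urllib.parse.urljoin`, because `urljoin` with an absolute `index_path` would drop any path component of `docs_url`.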
## Deployment posture (Château du Faune)
Both `docs.ariege.io` and the maubot daemon run on **cfaun**. The bot
hits the docs site over the host's loopback / internal network, so:
- No WAN dependency — the bot works during internet outages
- The fetch is fast (no TLS handshake to the public internet)
- If `docs.ariege.io` is down externally, the bot is unaffected
- Same applies if a future inference node (e.g. a ZeroClaw box) lives
on the internal network: it can hit the same internal URL

If you're deploying elsewhere, point `docs_url` at whichever URL the
bot's host can actually reach.
## Build + iterate
```sh
cd ~/dev/maubot-plugins/wiki
zip -j ../wiki.mbp maubot.yaml base-config.yaml *.py
```
Upload via maubot UI → Plugins → click existing → upload new `.mbp`.
**Hit Save on the instance** after upload (the standard maubot
facepalm). For a new instance, edit the config to point at your docs
site and save.
## Known limitations (v1)
- **No LLM synthesis.** Returns matched passages, not a synthesized
answer. RAG (`!ask` → cited synthesized answer) is the natural Phase
2b enhancement when the inference node is live.
- **Stopwords are minimal.** A query like "how do I" consists mostly of
filler words and may return weak results. Phrase your queries with the
actual content words ("water shutoff", "winter feeding").
- **No spell correction on content terms.** Title fuzzy match works
for `!doc`; for `!ask` you need to spell the keywords correctly.
- **No personalization.** Everyone in the room sees the same hits.
- **No multi-site support per plugin instance.** One Quartz site per
maubot instance — to serve a second docs source, install a second
instance with a different config.
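
To see why filler-heavy queries score poorly, consider this sketch of query filtering with a short stopword list. The list and the `content_terms` helper are hypothetical, not the plugin's actual stopwords, and this is only one reading of the limitation above.

```python
import re

# Hypothetical minimal stopword list; the plugin's real list is not shown here.
STOPWORDS = {"the", "a", "an", "of", "to", "and"}


def content_terms(query, stopwords=STOPWORDS):
    """Lowercase the query, split into words, and drop stopwords."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return [w for w in words if w not in stopwords]


print(content_terms("how do I shut the water off"))
print(content_terms("water shutoff"))
```

With a list this short, words like "how", "do", and "off" survive filtering and can match many pages, which dilutes the ranking. The second query contains only words that identify the topic.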
## Adopting for a different docs site
This plugin is intentionally generator-agnostic at the content layer:
anything that emits a `{slug: {title, content}}` JSON map will work.
For non-Quartz docs sites, you can either:

1. Adapt the upstream build to emit a compatible `contentIndex.json`
2. Fork this plugin's `_refresh()` to parse your site's index shape

Common alternatives adopters might be running: MkDocs (with the
mkdocs-material search plugin), Docusaurus, mdBook, or a custom
generator.
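
As an example of the first option, the sketch below converts a MkDocs-style search index into the `{slug: {title, content}}` shape. It assumes the index looks like `{"docs": [{"location", "title", "text"}, ...]}` with per-section entries marked by a `#anchor` in `location`; check that against what your site actually emits. `from_mkdocs` is a hypothetical helper, not part of this plugin.

```python
def from_mkdocs(search_index):
    """Fold MkDocs-style page and section entries into one entry per page."""
    pages = {}
    for doc in search_index.get("docs", []):
        slug, _, anchor = doc.get("location", "").partition("#")
        slug = slug.strip("/") or "index"  # the site root has an empty location
        page = pages.setdefault(slug, {"title": "", "content": ""})
        title = doc.get("title", "")
        # Prefer the page-level title; use a section title only as a fallback.
        if title and (not anchor or not page["title"]):
            page["title"] = title
        text = doc.get("text", "")
        if text:
            page["content"] = (page["content"] + " " + text).strip()
    return pages


# Hypothetical sample input.
sample = {
    "docs": [
        {"location": "guide/", "title": "Guide", "text": "Intro text."},
        {"location": "guide/#setup", "title": "Setup", "text": "Install it."},
        {"location": "", "title": "Home", "text": "Welcome."},
    ]
}

print(from_mkdocs(sample))
```

The sketch does not strip any HTML that may be present in the `text` fields. The resulting map could be written out as a `contentIndex.json` at the path the bot is configured to fetch.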