feat(wiki): docs-lookup plugin against Quartz contentIndex

New maubot plugin that points at any Quartz-rendered docs site and
answers chat queries by full-text searching its emitted
/static/contentIndex.json. Default config targets docs.ariege.io
(castle-docs).

Commands:
  !ask <query>            search corpus; top-N hits with snippet + link
  !doc <slug-or-title>    open a specific page (fuzzy title match)
  !wiki / !wiki refresh   status / force re-index

Architecture:
- Periodic fetch (default 10 min) of /static/contentIndex.json
- In-memory inverted-ish scoring: title hit 5pt, content hit 1pt + freq
- No LLM: pure deterministic keyword search; RAG is future Phase 2b
- No DB: index is upstream-derived cache, repopulates on bot restart

Deployment posture: docs.ariege.io is served from cfaun alongside
maubot, so the bot hits it over the host's internal network and keeps
working during WAN outages. base-config.yaml exposes docs_url +
index_path for adopters pointing at their own site.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

parent b7a096a77a
commit 8f83d8df5e

5 changed files with 382 additions and 0 deletions

wiki/README.md (normal file, +117)

@@ -0,0 +1,117 @@

# wiki

Documentation-lookup Matrix bot. Points at any
[Quartz](https://quartz.jzhao.xyz/)-rendered docs site, periodically
fetches its `contentIndex.json`, and answers queries in chat.

Designed to be community-portable: it works against any Quartz site you
configure it for, not just `docs.ariege.io`. Adjust `docs_url` per
instance.

## Commands

```
!ask <question>        # full-text search the docs, top 3 with snippets
!doc <slug-or-title>   # open a specific page (exact slug or fuzzy title)
!wiki                  # status: doc count, last refresh, source URL
!wiki refresh          # force re-index now (admin nicety)
```

## Examples

```
!ask how do I shut the water off
!ask alpaca feeding winter
!ask power outage
!doc emergency/water-emergency
!doc water emergency            # fuzzy title match works too
!wiki                           # are we up to date?
```

The bot replies with markdown links to the doc pages, so clicking
through opens the full doc in a browser.

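As a rough illustration of the `!doc` lookup order (exact slug first, then
fuzzy title), here is a minimal sketch. It assumes the fuzzy step uses the
standard library's `difflib`; the plugin's actual matching code is not shown
in this README, and `resolve_doc`, `SAMPLE_INDEX`, and the `cutoff` value are
hypothetical names chosen for the example.

```python
import difflib

# Hypothetical sample in the {slug: {title, content}} shape described in this README.
SAMPLE_INDEX = {
    "emergency/water-emergency": {"title": "Water Emergency", "content": "Shut off the main valve."},
    "animals/alpaca-feeding": {"title": "Alpaca Feeding", "content": "Extra hay in winter."},
}


def resolve_doc(query, index, cutoff=0.6):
    """Return the slug for an exact slug match, else the closest title match, else None."""
    slug = query.strip().strip("/")
    if slug in index:
        return slug
    # Map lower-cased titles back to their slugs so matching is case-insensitive.
    by_title = {entry.get("title", s).lower(): s for s, entry in index.items()}
    close = difflib.get_close_matches(query.strip().lower(), list(by_title), n=1, cutoff=cutoff)
    return by_title[close[0]] if close else None


print(resolve_doc("emergency/water-emergency", SAMPLE_INDEX))  # exact slug
print(resolve_doc("water emergency", SAMPLE_INDEX))            # title match
```

A higher `cutoff` makes the title match stricter; a lower one tolerates more
typos at the cost of occasional wrong pages.
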
## How it works

Quartz emits `/static/contentIndex.json` as part of its standard build:
a flat `{slug: {title, content, tags}}` map of every published page.
The plugin fetches that file on a timer (default every 10 minutes),
keeps an in-memory index, and scores each page against the query by:

- Title hits: 5 points each
- Content hits: 1 point + 0.1 × frequency

The top N results (default 3, set by `max_results`) come back with a
short snippet around the first match. **No LLM is involved** in v1; it
is pure deterministic keyword search. Phase 2b / future work may add an
LLM synthesis step (RAG) once the inference layer is up.

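The scoring rule above can be sketched as follows. This is an illustration
under stated assumptions, not the plugin's code: it assumes points are
awarded per distinct query term, that tokens are lower-cased alphanumeric
runs, and that ties are broken by slug. `tokenize`, `score`, `search`,
`snippet`, and `SAMPLE_INDEX` are hypothetical names.

```python
import re

TITLE_POINTS = 5.0    # per query term found in the title
CONTENT_POINTS = 1.0  # per query term found in the content
FREQ_WEIGHT = 0.1     # extra points per occurrence in the content

# Hypothetical sample in the {slug: {title, content}} shape.
SAMPLE_INDEX = {
    "emergency/water-emergency": {
        "title": "Water Emergency",
        "content": "Shut off the water at the main valve. Water lines can freeze in winter.",
    },
    "animals/alpaca-feeding": {
        "title": "Alpaca Feeding",
        "content": "In winter, give the alpacas extra hay and check their water daily.",
    },
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, entry):
    """Score one page: 5 per title hit, 1 + 0.1 * frequency per content hit."""
    title_tokens = set(tokenize(entry.get("title", "")))
    content_tokens = tokenize(entry.get("content", ""))
    total = 0.0
    for term in dict.fromkeys(tokenize(query)):  # de-duplicate, keep order
        if term in title_tokens:
            total += TITLE_POINTS
        freq = content_tokens.count(term)
        if freq:
            total += CONTENT_POINTS + FREQ_WEIGHT * freq
    return total


def search(query, index, max_results=3):
    """Return up to max_results (score, slug) pairs, best first, dropping zero scores."""
    hits = [(score(query, entry), slug) for slug, entry in index.items()]
    hits = [hit for hit in hits if hit[0] > 0]
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return hits[:max_results]


def snippet(query, content, chars=160):
    """Cut a window of roughly `chars` characters around the earliest query-term match."""
    lowered = content.lower()
    positions = [p for p in (lowered.find(t) for t in tokenize(query)) if p >= 0]
    if not positions:
        return content[:chars]
    start = max(0, min(positions) - chars // 2)
    return content[start:start + chars]


for points, slug in search("winter feeding", SAMPLE_INDEX):
    print(f"{points:.1f}  {slug}")
```

Because a title hit outweighs several content hits, a page whose title names
the topic tends to rank above pages that only mention it in passing.
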
## Config

`base-config.yaml` (override per maubot instance from the UI):

```yaml
docs_url: https://docs.ariege.io        # Quartz site base URL
index_path: /static/contentIndex.json   # standard Quartz path
refresh_minutes: 10                     # re-fetch cadence
max_results: 3                          # !ask hit limit
snippet_chars: 160                      # snippet window
site_name: Castle Docs                  # human-readable label in output
```

For internal-network deployments (the recommended posture, see below),
set `docs_url: http://<internal-hostname>` instead of the public URL.

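To make the relationship between `docs_url` and `index_path` concrete, here
is a small sketch of how the two might be combined into the URL that gets
fetched, with per-instance overrides layered on the defaults. The helper
names (`effective_config`, `index_url`) and the `docs.internal` hostname are
hypothetical; the plugin may build its URL differently.

```python
# Defaults mirroring base-config.yaml above.
DEFAULTS = {
    "docs_url": "https://docs.ariege.io",
    "index_path": "/static/contentIndex.json",
    "refresh_minutes": 10,
    "max_results": 3,
    "snippet_chars": 160,
    "site_name": "Castle Docs",
}


def effective_config(overrides):
    """Layer per-instance overrides on top of the defaults."""
    return {**DEFAULTS, **overrides}


def index_url(docs_url, index_path):
    """Join base URL and index path with exactly one slash between them."""
    return docs_url.rstrip("/") + "/" + index_path.lstrip("/")


# Hypothetical internal-network override: only docs_url changes.
cfg = effective_config({"docs_url": "http://docs.internal"})
print(index_url(cfg["docs_url"], cfg["index_path"]))
```

Plain string joining is used here rather than `urllib.parse.urljoin`, since
`urljoin` would discard any path prefix on `docs_url` when `index_path`
starts with a slash.
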
## Deployment posture (Château du Faune)

Both `docs.ariege.io` and the maubot daemon run on **cfaun**. The bot
reaches the docs site over the host's loopback / internal network, so:

- No WAN dependency: the bot works during internet outages
- The fetch is fast (no TLS handshake to the public internet)
- If `docs.ariege.io` is unreachable from outside, the bot is unaffected
- The same applies if a future inference node (e.g. a ZeroClaw box) lives
  on the internal network: it can hit the same internal URL

If you're deploying elsewhere, point `docs_url` at whichever URL the
bot's host can actually reach.

## Build + iterate

```sh
cd ~/dev/maubot-plugins/wiki
zip -j ../wiki.mbp maubot.yaml base-config.yaml *.py
```

Upload via maubot UI → Plugins → click the existing plugin → upload the
new `.mbp`. **Hit Save on the instance** after upload (the standard
maubot facepalm). For a new instance, edit the config to point at your
docs site and save.

## Known limitations (v1)

- **No LLM synthesis.** Returns matched passages, not a synthesized
  answer. RAG (`!ask` → cited synthesized answer) is the natural Phase
  2b enhancement once the inference node is live.
- **Stopword handling is minimal.** A query like "how do I" is mostly
  stopwords and may return weak results; phrase queries with the
  actual content words instead ("water shutoff", "winter feeding").
- **No spell correction on content terms.** Title fuzzy match works
  for `!doc`; for `!ask` you need to spell the keywords correctly.
- **No personalization.** Everyone in the room sees the same hits.
- **No multi-site support per plugin instance.** One Quartz site per
  maubot instance. To serve a second docs source, install a second
  instance with a different config.

## Adopting for a different docs site

This plugin is intentionally protocol-agnostic at the content layer:
anything that emits a `{slug: {title, content}}` JSON map will work.
For non-Quartz docs sites, you can either:

1. Adapt the upstream build to emit a compatible `contentIndex.json`
2. Fork this plugin's `_refresh()` to parse your site's index shape

Common alternatives adopters may want to consider: MkDocs (with the
mkdocs-material search plugin), Docusaurus, mdBook, or a custom
generator.

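For option 1, a conversion step might look like the sketch below, which
reshapes a MkDocs-style search index into the `{slug: {title, content}}` map.
The input shape (a top-level `docs` list whose items carry `location`,
`title`, and `text`) is an assumption about MkDocs output that should be
checked against your own build, and `mkdocs_to_content_index` is a
hypothetical helper, not part of this plugin.

```python
def mkdocs_to_content_index(search_index):
    """Fold page and section records into one {slug: {title, content}} entry per page."""
    out = {}
    for doc in search_index.get("docs", []):
        # Section records are assumed to look like "guide/setup/#config".
        page, _, anchor = doc.get("location", "").partition("#")
        slug = page.strip("/") or "index"
        entry = out.setdefault(slug, {"title": "", "content": ""})
        # Only the page-level record (no anchor) supplies the page title.
        if not anchor and not entry["title"]:
            entry["title"] = doc.get("title", "")
        text = doc.get("text", "")
        if text:
            entry["content"] = (entry["content"] + " " + text).strip()
    # Fall back to the slug when no page-level title was seen.
    for slug, entry in out.items():
        if not entry["title"]:
            entry["title"] = slug
    return out


# Hypothetical input for illustration only.
sample = {
    "docs": [
        {"location": "", "title": "Home", "text": "Welcome"},
        {"location": "guide/setup/", "title": "Setup", "text": "Install it."},
        {"location": "guide/setup/#config", "title": "Config", "text": "Edit the file."},
    ]
}
print(mkdocs_to_content_index(sample))
```

The resulting map can be written out as JSON and served at whatever path
`index_path` points to. If the source text contains HTML markup, stripping it
before indexing would likely improve snippets.
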