toolserver/README_PDF.md
2026-02-23 16:39:46 +01:00

PDF endpoints in toolserver

This build adds PDF ingestion, per-page text extraction, page rendering, optional extraction of embedded images, and indexing into Chroma and (optionally) Meilisearch.

Enable / config (env)

  • ENABLE_PDF=1
  • PDF_STORE_DIR=/data/pdf_store
  • PDF_MAX_MB=80
  • MEILI_PDF_INDEX=pdf_docs (optional; used by /pdf/{doc_id}/index when add_meili=true)
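
As a rough illustration of how these variables might be read, here is a minimal sketch. The names `PdfConfig` and `load_pdf_config` are hypothetical and not part of toolserver. Treating the values listed above as defaults, treating only "1" as enabled, and using 1 MB = 1024 * 1024 bytes are all assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PdfConfig:
    """Hypothetical container for the PDF-related settings."""
    enabled: bool
    store_dir: str
    max_bytes: int
    meili_index: Optional[str]


def load_pdf_config(env: Mapping[str, str] = os.environ) -> PdfConfig:
    """Read the PDF settings from an environment-like mapping."""
    return PdfConfig(
        # Assumption: only the literal "1" enables the PDF endpoints.
        enabled=env.get("ENABLE_PDF", "0") == "1",
        # Assumption: the value shown in this README is used as the fallback.
        store_dir=env.get("PDF_STORE_DIR", "/data/pdf_store"),
        # Assumption: 1 MB is counted as 1024 * 1024 bytes.
        max_bytes=int(env.get("PDF_MAX_MB", "80")) * 1024 * 1024,
        # Optional: an unset or empty variable means no Meilisearch index.
        meili_index=env.get("MEILI_PDF_INDEX") or None,
    )
```

Passing a plain dict instead of `os.environ` makes the parsing easy to check in isolation, e.g. `load_pdf_config({"ENABLE_PDF": "1", "PDF_MAX_MB": "2"})`.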

PyMuPDF is required:

  • pip install pymupdf
  • import name: fitz
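
Because the package name (`pymupdf`) and the import name (`fitz`) differ, a quick availability check can save some confusion. The helper below is a sketch only; `pymupdf_available` is a hypothetical name and not something toolserver provides.

```python
import importlib.util


def pymupdf_available() -> bool:
    """Return True if a module named `fitz` can be located on the import path."""
    return importlib.util.find_spec("fitz") is not None


if __name__ == "__main__":
    if pymupdf_available():
        print("fitz module found")
    else:
        print("fitz module missing; try: pip install pymupdf")
```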

Endpoints

  • POST /pdf/ingest (multipart/form-data, field: file) -> {doc_id, n_pages, sha256}

  • GET /pdf/{doc_id}/text?page=1&mode=blocks|text|dict -> page text; mode=blocks returns each block's bbox + text

  • GET /pdf/{doc_id}/render?page=1&dpi=200 -> image/png (cached on disk)

  • GET /pdf/{doc_id}/images?page=1 -> list of embedded images (xref ids)

  • GET /pdf/{doc_id}/image/{xref} -> download embedded image

  • POST /pdf/{doc_id}/index (body: PdfIndexRequest) -> indexes chunks into Chroma and, optionally, Meilisearch; supports extra_text_by_page for vision captions
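
The request shapes listed above can be sketched with a few standard-library helpers. Everything here is illustrative: `BASE_URL` and all function names are hypothetical. The body built by `index_body` uses only the two PdfIndexRequest fields this README mentions (`add_meili`, `extra_text_by_page`), and the mapping of page number to caption text is an assumption about that schema, which may have other fields.

```python
import json
from typing import Dict, Optional
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # hypothetical; use your toolserver address


def pdf_url(doc_id: str, action: str, **params) -> str:
    """Build /pdf/{doc_id}/{action}, appending a query string if params are given."""
    path = f"{BASE_URL}/pdf/{quote(doc_id, safe='')}/{action}"
    return f"{path}?{urlencode(params)}" if params else path


def text_url(doc_id: str, page: int = 1, mode: str = "blocks") -> str:
    """URL for page text; mode is one of blocks, text, dict."""
    if mode not in ("blocks", "text", "dict"):
        raise ValueError(f"unsupported mode: {mode}")
    return pdf_url(doc_id, "text", page=page, mode=mode)


def render_url(doc_id: str, page: int = 1, dpi: int = 200) -> str:
    """URL for the rendered PNG of one page."""
    return pdf_url(doc_id, "render", page=page, dpi=dpi)


def images_url(doc_id: str, page: int = 1) -> str:
    """URL for the list of embedded images on one page."""
    return pdf_url(doc_id, "images", page=page)


def image_url(doc_id: str, xref: int) -> str:
    """URL for downloading one embedded image by its xref id."""
    return f"{BASE_URL}/pdf/{quote(doc_id, safe='')}/image/{xref}"


def index_body(add_meili: bool = False,
               extra_text_by_page: Optional[Dict[int, str]] = None) -> str:
    """Serialize a JSON body for POST /pdf/{doc_id}/index.

    Assumption: extra_text_by_page maps page numbers to caption text.
    JSON object keys are strings, so page numbers are converted.
    """
    body: Dict[str, object] = {"add_meili": add_meili}
    if extra_text_by_page:
        body["extra_text_by_page"] = {
            str(page): text for page, text in extra_text_by_page.items()
        }
    return json.dumps(body)
```

For example, `text_url("abc123")` yields `http://localhost:8000/pdf/abc123/text?page=1&mode=blocks`, which can then be fetched with any HTTP client. The upload to /pdf/ingest is left out because multipart encoding is easier with a dedicated HTTP library than with the standard library.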