Native Ruby gem for parsing documents (PDF, DOCX, XLSX, images with OCR) with zero runtime dependencies. Statically links MuPDF for PDF extraction and Tesseract for OCR.
Required Ruby Version
>= 3.4, < 3.5.dev
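The constraint above can be checked against a given interpreter with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. This is a minimal sketch, not part of the gem: `REQUIRED_RUBY` and `supported_ruby?` are hypothetical names introduced only for illustration.

```ruby
require "rubygems"

# Constraint copied from the "Required Ruby Version" field above.
REQUIRED_RUBY = Gem::Requirement.new(">= 3.4", "< 3.5.dev")

# Hypothetical helper (not provided by the gem): returns true when the
# given Ruby version string satisfies the constraint.
def supported_ruby?(version)
  REQUIRED_RUBY.satisfied_by?(Gem::Version.new(version))
end

supported_ruby?("3.4.1") # => true  (any 3.4.x release qualifies)
supported_ruby?("3.3.9") # => false (below the lower bound)
supported_ruby?("3.5.0") # => false (3.5.0 sorts after 3.5.dev)

# Check the interpreter that is currently running.
puts "Ruby #{RUBY_VERSION} supported: #{supported_ruby?(RUBY_VERSION)}"
```

The upper bound is written as `< 3.5.dev` rather than `< 3.5` because RubyGems sorts prerelease versions before their final release, so the `.dev` suffix also keeps 3.5 prereleases outside the supported range.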
Authors
Chris Petersen
Versions
- 0.1.2.1 January 15, 2026 (19 KB)
- 0.1.2.1 January 15, 2026 x86_64-linux (7.01 MB)
- 0.1.2.1 January 15, 2026 arm64-darwin-23 (5.78 MB)
- 0.1.2.1 January 15, 2026 aarch64-linux (6.84 MB)
- 0.1.2 November 07, 2025 (19 KB)
- 0.1.2 November 07, 2025 aarch64-linux (6.81 MB)