Monocle
Document and image text extraction
A job queue manager extracting text and metadata from documents and images using Apache Tika and Tesseract OCR.
- Docker container
- Kubernetes Helm chart
- 5 queue backends
- 4 object-store backends
- 7 notification drivers
- 1 cache engine
- 4 languages (EN, FR, ES, PT)
- REST API + OpenAPI 3.0.3
- Realtime WebSocket channels
- MCP server for AI agents
Monocle turns scanned documents and photos into searchable text. It extracts text from PDFs, Office documents, and HTML via Apache Tika, and runs OCR on images and scans via Tesseract. A Redis job queue, S3 storage, and webhook notifications are included. It suits digital archiving, document scanning services, and building searchable libraries from mixed media.
Key features
- Extract text from PDFs, Word, Excel, PowerPoint, HTML, and more via Apache Tika
- OCR for images and scanned documents via Tesseract
- Metadata extraction to JSON
- Redis job queue with webhooks
- S3 storage for input and output
Where it goes beyond the obvious
- Document extraction and image OCR unified in a single service and pipeline
Tech highlights
- Tools: Apache Tika, Tesseract
- Input: PDF, Office, HTML, images, scanned documents
- Storage: S3/MinIO
- Queue: Redis with webhooks
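Because documents and images go through the same service, a client mostly has to decide which of the two extraction tasks fits a given file. The sketch below shows one way to do that by file extension. The task names are the ones reported by GET /tasks; the extension sets and the `pick_task` helper are assumptions made for illustration, not part of Monocle.

```python
from pathlib import PurePosixPath

# Task names as listed by GET /tasks.
FILE_TASK = "extract-text-from-file"
IMAGE_TASK = "extract-text-from-image"

# Assumed extension sets; the formats Monocle actually accepts may differ.
DOCUMENT_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx", ".html", ".htm",
}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp", ".gif"}


def pick_task(filename: str) -> str:
    """Return the extraction task that fits the file's extension (hypothetical helper)."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix in DOCUMENT_EXTENSIONS:
        return FILE_TASK
    if suffix in IMAGE_EXTENSIONS:
        return IMAGE_TASK
    raise ValueError(f"no extraction task known for {filename!r}")
```

Extension matching is only a first approximation: a scanned PDF carries a .pdf extension but contains page images, so a real client may need a smarter check before choosing a task.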
REST API surface
- POST /push: Enqueue an extraction job (file or image)
- GET /results: List extraction jobs
- GET /tasks: List available tasks (extract-text-from-file, extract-text-from-image)
- GET /storage/list: Browse output .txt/.json files
- GET /storage/download: Download extracted text
- WS /realtime: Live progress
Full spec at GET /openapi — Swagger UI at /swagger/
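As a rough illustration of the enqueue step, the sketch below builds a POST /push request with Python's standard library. Only the path and method come from the list above. The JSON field names (`task`, `source`), the base URL, and the `build_push_request` helper are assumptions; the real request schema is whatever GET /openapi reports.

```python
import json
import urllib.request


def build_push_request(base_url: str, task: str, source_key: str) -> urllib.request.Request:
    """Build, but do not send, a POST /push request (hypothetical helper)."""
    # Assumed body shape: the field names "task" and "source" are illustrative only.
    body = json.dumps({"task": task, "source": source_key}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/push",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

With a running instance, the request could be sent with `urllib.request.urlopen(req)`. Building the request separately from sending it keeps the payload easy to inspect while the actual schema is being confirmed against the OpenAPI spec.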
Backends you can actually pick from.
This service speaks the backends below natively. Swap with a single environment variable.
Queues
- Redis
- RabbitMQ
- SQS
- Kafka
- STOMP
Cache
- Redis
Object storage
- S3
- MinIO
- Azure Blob
- Local
Notifications
- Slack
- Discord
- Teams
- FCM
- APNs
- SNS
- WebPush
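Since each backend is selected through an environment variable, a deployment script can check the chosen values against the lists above before starting the container. The sketch below is one way to do that. The variable names (`MONOCLE_QUEUE` and so on) and the lowercase value spellings are assumptions for illustration; the variable names Monocle actually reads are not given here.

```python
import os
from typing import Mapping, Optional

# Backend names from the lists above, lowercased. The exact spellings are assumed.
SUPPORTED = {
    "queue": {"redis", "rabbitmq", "sqs", "kafka", "stomp"},
    "cache": {"redis"},
    "storage": {"s3", "minio", "azure-blob", "local"},
    "notifications": {"slack", "discord", "teams", "fcm", "apns", "sns", "webpush"},
}

# Hypothetical environment variable names, one per backend kind.
ENV_VARS = {
    "queue": "MONOCLE_QUEUE",
    "cache": "MONOCLE_CACHE",
    "storage": "MONOCLE_STORAGE",
    "notifications": "MONOCLE_NOTIFICATIONS",
}


def resolve_backend(kind: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Read the backend chosen for `kind` and reject values outside the supported set."""
    env = os.environ if env is None else env
    var = ENV_VARS[kind]
    value = env.get(var, "").strip().lower()
    if value not in SUPPORTED[kind]:
        allowed = ", ".join(sorted(SUPPORTED[kind]))
        raise ValueError(f"{var}={value!r} is not one of: {allowed}")
    return value
```

For example, `resolve_backend("queue", {"MONOCLE_QUEUE": "Kafka"})` returns `"kafka"`, while an unset or unknown value raises `ValueError` before anything is deployed.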
Use cases
- Digital archiving: converting physical documents to searchable text
- Legal platforms extracting text from contracts and evidence
- Libraries digitising mixed-media collections
- E-discovery platforms processing document batches
- Accessibility services converting images to text
Monocle vs AWS Textract, Azure Document Intelligence, Google Cloud Document AI
Text extraction and OCR without Azure Document Intelligence fees
More in media
Crunch
Media
A job queue manager that batch-processes images — converts, resizes, compresses, watermarks, applies effects, then stores results back to S3.
Gofer
Media
A job queue manager converting Office documents (PPTX, ODP) to PDF and image packages using Gotenberg plus image optimizers.
Greenlight
Media
A job queue manager converting screenwriting formats (FDX, Fountain, FadeIn, PDF) to and from ScreenJSON with validation and AES-256 encryption.
Deploy Monocle. Today.
One Docker image. One compose stack. One afternoon to production. Monocle is waiting.