merlyn/collector/utils
Timothy Carambat 2a9066e83a
OCR PDFs as fallback during upload (#3204)
* OCR PDFs as fallback in spawn thread

* wip

* build our own worker fanout and wrapper

* norm pkgs

* bump dev
2025-02-14 11:57:31 -08:00
comKey [BETA] Live document sync (#1719) 2024-06-21 13:38:50 -07:00
EncryptionWorker [BETA] Live document sync (#1719) 2024-06-21 13:38:50 -07:00
extensions chore: rename Github to GitHub (#3199) 2025-02-13 10:45:43 -08:00
files autodetect parseable text file contents (#3079) 2025-01-31 13:31:26 -08:00
http Document Processor v2 (#442) 2023-12-14 15:14:56 -08:00
logger patch logger for full logs 2024-07-19 18:35:41 -07:00
OCRLoader OCR PDFs as fallback during upload (#3204) 2025-02-14 11:57:31 -08:00
tokenizer Add tokenizer improvements via Singleton class and estimation (#3072) 2025-01-30 17:55:03 -08:00
url Allow 127.0.0.1 as valid URL for scraping (#2560) 2024-10-31 09:57:28 -07:00
WhisperProviders Audio file validations (#2902) 2024-12-30 14:48:28 -08:00
constants.js Support XLSX files (#2403) 2024-10-03 13:45:23 -07:00