merlyn/collector
Timothy Carambat fd4929b4d2
Feature/drupalwiki collector (#3693)
* Implement DrupalWiki collector

* Add attachment downloading and processing functionality (#3)

* linting

* Linting
Add citation image
small refactors
add URL for citation identifier

---------

Co-authored-by: em <eugen.mayer@kontextwork.de>
Co-authored-by: rexjohannes <53578137+rexjohannes@users.noreply.github.com>
Co-authored-by: Eugen Mayer <136934+EugenMayer@users.noreply.github.com>
2025-04-21 09:17:24 -07:00
extensions Feature/drupalwiki collector (#3693) 2025-04-21 09:17:24 -07:00
hotdir Document Processor v2 (#442) 2023-12-14 15:14:56 -08:00
middleware [BETA] Live document sync (#1719) 2024-06-21 13:38:50 -07:00
processLink Add querySelectorAll capability to web-scraping block (#3186) 2025-02-13 16:11:15 -08:00
processRawText Add tokenizer improvments via Singleton class and estimation (#3072) 2025-01-30 17:55:03 -08:00
processSingleFile feat: Add multilingual support for ocr module (#3325) 2025-02-27 12:31:17 -08:00
storage feat: Embed on-instance Whisper model for audio/mp4 transcribing (#449) 2023-12-15 11:20:13 -08:00
utils Feature/drupalwiki collector (#3693) 2025-04-21 09:17:24 -07:00
.env.example devcontainer v1 (#297) 2024-01-08 15:31:06 -08:00
.gitignore Document Processor v2 (#442) 2023-12-14 15:14:56 -08:00
.nvmrc Document Processor v2 (#442) 2023-12-14 15:14:56 -08:00
index.js Add querySelectorAll capability to web-scraping block (#3186) 2025-02-13 16:11:15 -08:00
nodemon.json Document Processor v2 (#442) 2023-12-14 15:14:56 -08:00
package.json Windows development environment variables support (#3354) 2025-02-27 10:43:31 -08:00
yarn.lock Windows development environment variables support (#3354) 2025-02-27 10:43:31 -08:00