michael/merlyn - merlyn - Honeywest

michael/merlyn

Author	SHA1	Message	Date
Marcello Fitton	eb77876127	Add HTTP request/response logging middleware for development mode (#4425 ) * Add HTTP request logging middleware for development mode - Introduced httpLogger middleware to log HTTP requests and responses. - Enabled logging only in development mode to assist with debugging. * Update httpLogger middleware to disable time logging by default * Add httpLogger middleware for development mode in collector service * Refactor httpLogger middleware to rename timeLogs parameter to enableTimestamps for clarity * Make HTTP Logger only mount in development and environment flag is enabled. * Update .env.example to clarify HTTP Logger configuration comments --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-09-29 13:33:15 -07:00
Jonas Stawski	b8d4cc3454	Added metadata parameter to document/upload, document/upload/{folderName}, and document/upload-link (#4342 ) * Added the ability to pass in metadata to the /document/upload/{folderName} endpoint * Added the ability to pass in metadata to the /document/upload-link endpoint * feat: added metadata to document/upload api endpoint * simplify optional metadata in document dev api endpoints * lint * patch handling of metadata in dev api * Linting, small comments --------- Co-authored-by: jstawskigmi <jstawski@getmyinterns.org> Co-authored-by: shatfield4 <seanhatfield5@gmail.com> Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-09-17 11:17:29 -07:00
Timothy Carambat	0fb33736da	Workspace Chat with documents overhaul (#4261 ) * Create parse endpoint in collector (#4212) * create parse endpoint in collector * revert cleanup temp util call * lint * remove unused cleanupTempDocuments function * revert slug change minor change for destinations --------- Co-authored-by: timothycarambat <rambat1010@gmail.com> * Add parsed files table and parse server endpoints (#4222) * add workspace_parsed_files table + parse endpoints/models * remove dev api parse endpoint * remove unneeded imports * iterate over all files + remove unneeded update function + update telemetry debounce * Upload UI/UX context window check + frontend alert (#4230) * prompt user to embed if exceeds prompt window + handle embed + handle cancel * add tokenCountEstimate to workspace_parsed_files + optimizations * use util for path locations + use safeJsonParse * add modal for user decision on overflow of context window * lint * dynamic fetching of provider/model combo + inject parsed documents * remove unneeded comments * popup ui for attaching/removing files + warning to embed + wip fetching states on update * remove prop drilling, fetch files/limits directly in attach files popup * rework ux of FE + BE optimizations * fix ux of FE + BE optimizations * Implement bidirectional sync for parsed file states linting small changes and comments * move parse support to another endpoint file simplify calls and loading of records * button borders * enable default users to upload parsed files but NOT embed * delete cascade on user/workspace/thread deletion to remove parsedFileRecord * enable bgworker with "always" jobs and optional document sync jobs orphan document job: Will find any broken reference files to prevent overpollution of the storage folder. This will run 10s after boot and every 12hr after * change run timeout for orphan job to 1m to allow settling before spawning a worker * linting and cleanup pr --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com> * dev build * fix tooltip hiding during embedding overflow files * prevent crash log from ERRNO on parse files * unused import * update docs link * Migrate parsed-files to GET endpoint patch logic for grabbing models names from utils better handling for undetermined context windows (null instead of Pos_INIFI) UI placeholder for null context windows * patch URL --------- Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>	2025-08-11 09:26:19 -07:00
Sean Hatfield	610bdd4673	Allow custom headers in upload-link endpoint (#3695 ) * allow custom headers in upload-link endpoint * override loader.scrape to allow for passing of headers in langchain puppeteer * lint * Rename some variables move positional args to named args update documentation to reflect arg changes and funciton sigs validate header object before attempting to end to forward to request * update header validation for custom headers --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2025-04-22 12:47:12 -07:00
Timothy Carambat	b6d3a411b1	Add `querySelectorAll` capability to web-scraping block (#3186 ) * Add `querySelectorAll` capability to web-scraping block * patches and fallbacks * fix styles of text in web scraping block --------- Co-authored-by: shatfield4 <seanhatfield5@gmail.com>	2025-02-13 16:11:15 -08:00
Blazej Owczarczyk	348d9c8285	Add 3GB file size limit to body parser middlewares (#2390 )	2024-09-30 11:19:41 -07:00
Timothy Carambat	29c9eeaa5c	Add `winston` logging for production (#1811 )	2024-07-03 16:39:33 -07:00
Timothy Carambat	a5bb77f97a	Agent support for `@agent` default agent inside workspace chat (#1093 ) V1 of agent support via built-in `@agent` that can be invoked alongside normal workspace RAG chat.	2024-04-16 10:50:10 -07:00
Timothy Carambat	f4088d9348	RSA-Signing on server<->collector communication via API (#1005 ) * WIP integrity check between processes * Implement integrity checking on document processor payloads	2024-04-01 13:56:35 -07:00
Timothy Carambat	0ada882991	Support external transcription providers (#909 ) * Support External Transcription providers * patch files * update docs * fix return data	2024-03-14 15:43:26 -07:00
Timothy Carambat	48cb8f2897	Add support to upload rawText document via api (#692 ) * Add support to upload rawText document via api * update API doc endpoint with correct textContent key * update response swagger doc	2024-02-07 15:17:32 -08:00
Timothy Carambat	b35feede87	570 document api return object (#608 ) * Add support for fetching single document in documents folder * Add document object to upload + support link scraping via API * hotfixes for documentation * update api docs	2024-01-16 16:04:22 -08:00
Timothy Carambat	452582489e	GitHub loader extension + extension support v1 (#469 ) * feat: implement github repo loading fix: purge of folders fix: rendering of sub-files * noshow delete on custom-documents * Add API key support because of rate limits * WIP for frontend of data connectors * wip * Add frontend form for GitHub repo data connector * remove console.logs block custom-documents from being deleted * remove _meta unused arg * Add support for ignore pathing in request Ignore path input via tagging * Update hint	2023-12-18 15:48:02 -08:00
Timothy Carambat	61db981017	feat: Embed on-instance Whisper model for audio/mp4 transcribing (#449 ) * feat: Embed on-instance Whisper model for audio/mp4 transcribing resolves #329 * additional logging * add placeholder for tmp folder in collector storage Add cleanup of hotdir and tmp on collector boot to prevent hanging files split loading of model and file conversion into concurrency * update README * update model size * update supported filetypes	2023-12-15 11:20:13 -08:00
Timothy Carambat	719521c307	Document Processor v2 (#442 ) * wip: init refactor of document processor to JS * add NodeJs PDF support * wip: partity with python processor feat: add pptx support * fix: forgot files * Remove python scripts totally * wip:update docker to boot new collector * add package.json support * update dockerfile for new build * update gitignore and linting * add more protections on file lookup * update package.json * test build * update docker commands to use cap-add=SYS_ADMIN so web scraper can run update all scripts to reflect this remove docker build for branch	2023-12-14 15:14:56 -08:00