16 Commits

Author SHA1 Message Date
Timothy Carambat
3dedcede34
Filesystem Agent Skill overhaul (#5260)
* wip

* collector parse fixes

* refactor for class and also operation for reading

* add skill management panel

* management panel + lint

* Hide skill in non-docker context

* add ask-prompt for edit tool calls

* fix dep

* fix execa pkg (unused in codebase)

* simplify search with ripgrep only and build deps

* Fs skill i18n (#5264)

i18n

* add copy file support

* fix translations
2026-03-26 14:07:46 -07:00
timothycarambat
3b4f07cdbd
add longer HTTP TTL on forward extension requests
resolves #4605
2025-11-20 23:00:18 -08:00
timothycarambat
4ec85418c4
Solve theoretical bug in forwardRequestSigner
resolves #4611
2025-11-20 18:36:10 -08:00
Timothy Carambat
95557ee16f
Allow user to specify args for Chromium process so they don't need SYS_ADMIN on container. (#4397)
* allow user to specify args for Chromium process so they don't need SYS_ADMIN perms

* use arg flag content

* update console outputs
2025-09-17 16:31:08 -07:00
Jonas Stawski
b8d4cc3454
Added metadata parameter to document/upload, document/upload/{folderName}, and document/upload-link (#4342)
* Added the ability to pass in metadata to the /document/upload/{folderName} endpoint

* Added the ability to pass in metadata to the /document/upload-link endpoint

* feat: added metadata to document/upload api endpoint

* simplify optional metadata in document dev api endpoints

* lint

* patch handling of metadata in dev api

* Linting, small comments

---------

Co-authored-by: jstawskigmi <jstawski@getmyinterns.org>
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-09-17 11:17:29 -07:00
Timothy Carambat
0fb33736da
Workspace Chat with documents overhaul (#4261)
* Create parse endpoint in collector (#4212)

* create parse endpoint in collector

* revert cleanup temp util call

* lint

* remove unused cleanupTempDocuments function

* revert slug change
minor change for destinations

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>

* Add parsed files table and parse server endpoints (#4222)

* add workspace_parsed_files table + parse endpoints/models

* remove dev api parse endpoint

* remove unneeded imports

* iterate over all files + remove unneeded update function + update telemetry debounce

* Upload UI/UX context window check + frontend alert (#4230)

* prompt user to embed if exceeds prompt window + handle embed + handle cancel

* add tokenCountEstimate to workspace_parsed_files + optimizations

* use util for path locations + use safeJsonParse

* add modal for user decision on overflow of context window

* lint

* dynamic fetching of provider/model combo + inject parsed documents

* remove unneeded comments

* popup ui for attaching/removing files + warning to embed + wip fetching states on update

* remove prop drilling, fetch files/limits directly in attach files popup

* rework ux of FE + BE optimizations

* fix ux of FE + BE optimizations

* Implement bidirectional sync for parsed file states
linting
small changes and comments

* move parse support to another endpoint file
simplify calls and loading of records

* button borders

* enable default users to upload parsed files but NOT embed

* delete cascade on user/workspace/thread deletion to remove parsedFileRecord

* enable bgworker with "always" jobs and optional document sync jobs
Orphan document job: finds files with broken references to prevent overpollution of the storage folder. Runs 10s after boot and every 12hr thereafter.

* change run timeout for orphan job to 1m to allow settling before spawning a worker

* linting and cleanup pr

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>

* dev build

* fix tooltip hiding during embedding overflow files

* prevent crash log from ERRNO on parse files

* unused import

* update docs link

* Migrate parsed-files to GET endpoint
patch logic for grabbing models names from utils
better handling for undetermined context windows (null instead of POSITIVE_INFINITY)
UI placeholder for null context windows

* patch URL

---------

Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2025-08-11 09:26:19 -07:00
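The overflow check described in this commit (prompt the user to embed when parsed files exceed the prompt window, and treat an undetermined window as null) could look roughly like the sketch below. The function name, the return shape, and the comparison rule are assumptions made for illustration; only `tokenCountEstimate` is a name that appears in the commit message.

```typescript
// Hypothetical helper: the name and return shape are illustrative,
// not taken from the codebase.
type OverflowCheck = { known: boolean; exceeds: boolean };

function checkContextWindow(
  tokenCountEstimate: number,
  contextWindow: number | null
): OverflowCheck {
  // A null window means the limit could not be determined, so no overflow
  // can be asserted; a UI could show a placeholder in this case.
  if (contextWindow === null) return { known: false, exceeds: false };
  // Assumed rule: overflow when the estimate is strictly larger than the window.
  return { known: true, exceeds: tokenCountEstimate > contextWindow };
}
```

Using null for "unknown" keeps the comparison honest: an infinite window would make every upload look like it fits, while null forces the caller to handle the undetermined case explicitly.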
Sean Hatfield
610bdd4673
Allow custom headers in upload-link endpoint (#3695)
* allow custom headers in upload-link endpoint

* override loader.scrape to allow for passing of headers in langchain puppeteer

* lint

* Rename some variables
move positional args to named args
update documentation to reflect arg changes and function signatures
validate header object before attempting to forward it to the request

* update header validation for custom headers

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-04-22 12:47:12 -07:00
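As a rough illustration of validating a header object before forwarding it, the sketch below keeps only string-valued entries with non-empty names and discards everything else. The function name and the exact filtering rules are assumptions; the commit message does not describe the actual checks.

```typescript
// Hypothetical validator: rules here are illustrative assumptions,
// not the checks used by the upload-link endpoint.
function validateHeaders(input: unknown): Record<string, string> {
  const safe: Record<string, string> = {};
  // Reject anything that is not a plain key/value object.
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return safe;
  }
  for (const [key, value] of Object.entries(input as Record<string, unknown>)) {
    // Skip entries with blank names or non-string values.
    if (key.trim().length === 0) continue;
    if (typeof value !== "string") continue;
    safe[key] = value;
  }
  return safe;
}
```

Returning an empty object for invalid input, rather than throwing, lets the request proceed without custom headers when the caller passes something malformed.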
Timothy Carambat
1601eb986c
Enable bypass of ip limitations via ENV in collector processing (#3652)
* Enable bypass of ip limitations via ENV in collector startup
resolves #3625
connect #3626

* dev build

* bump dockerx build action

* enable runtime setting config of collector requests

* comments and linting for option passing

* unset

* update docs link

* linting and docs
2025-04-21 11:10:41 -07:00
AbelDuan
df166eb64e
feat: Add multilingual support for ocr module (#3325)
* Add multilingual support for ocr module

* Add OCR language as server var that is passed into Collector
Support all valid tesseract language codes
Filter and parse only valid codes with fallbacks

* persist TARGET_OCR_LANG

* update docker example env

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-02-27 12:31:17 -08:00
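Filtering a configured language list down to valid codes, with a fallback, might be sketched as follows. This assumes `TARGET_OCR_LANG` holds a comma-separated string and that the fallback is `eng`; both are assumptions, and `VALID_CODES` is a deliberately small illustrative subset of tesseract language codes rather than the full list.

```typescript
// Illustrative subset only; a real list would cover every tesseract code.
const VALID_CODES = new Set(["eng", "deu", "fra", "spa", "chi_sim"]);

// Hypothetical parser: the comma-separated format and the "eng" fallback
// are assumptions, not confirmed by the commit message.
function parseOcrLangs(
  raw: string | undefined,
  fallback: string[] = ["eng"]
): string[] {
  if (!raw) return fallback;
  const seen = new Set<string>();
  for (const part of raw.split(",")) {
    const code = part.trim();
    // Keep only recognized codes; the Set also removes duplicates.
    if (VALID_CODES.has(code)) seen.add(code);
  }
  return seen.size > 0 ? Array.from(seen) : fallback;
}
```

A call such as `parseOcrLangs("eng, deu, xx")` keeps `eng` and `deu` and silently drops the unknown `xx`, while an unset or fully invalid value falls back to the default list.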
Timothy Carambat
b6d3a411b1
Add querySelectorAll capability to web-scraping block (#3186)
* Add `querySelectorAll` capability to web-scraping block

* patches and fallbacks

* fix styles of text in web scraping block

---------

Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
2025-02-13 16:11:15 -08:00
Timothy Carambat
dc4ad6b5a9
[BETA] Live document sync (#1719)
* wip bg workers for live document sync

* Add ability to re-embed specific documents across many workspaces via background queue
bgworker is gated behind an experimental system setting flag that needs to be explicitly enabled
UI for watching/unwatching documents that are embedded.
TODO: UI to easily manage all bg tasks and see run results
TODO: UI to enable this feature and background endpoints to manage it

* create frontend views and paths
Move elements to correct experimental scope

* update migration to delete runs on removal of watched document

* Add watch support to YouTube transcripts (#1716)

* Add watch support to YouTube transcripts
refactor how sync is done for supported types

* Watch specific files in Confluence space (#1718)

Add failure-prune check for runs

* create tmp workflow modifications for beta image

* dual build
update copy of alert modals

* update job interval

* Add support for live-sync of Github files

* update copy for document sync feature

* hide Experimental features from UI

* update docs links

* [FEAT] Implement new settings menu for experimental features (#1735)

* implement new settings menu for experimental features

* remove unused context save bar

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>

* don't run job on boot

* unset workflow changes

* Add persistent encryption service
Relay key to collector so persistent encryption can be used
Encrypt any private data in chunkSources used for replay during resync jobs

* update JSDoc

* Linting and organization

* update modal copy for feature

---------

Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 13:38:50 -07:00
Timothy Carambat
1a5aacb001
Support multi-model whispers (#1444)
2024-05-17 21:31:29 -07:00
Timothy Carambat
a5bb77f97a
Agent support for @agent default agent inside workspace chat (#1093)
V1 of agent support via built-in `@agent` that can be invoked alongside normal workspace RAG chat.
2024-04-16 10:50:10 -07:00
Timothy Carambat
f4088d9348
RSA-Signing on server<->collector communication via API (#1005)
* WIP integrity check between processes

* Implement integrity checking on document processor payloads
2024-04-01 13:56:35 -07:00
Timothy Carambat
0ada882991
Support external transcription providers (#909)
* Support External Transcription providers

* patch files

* update docs

* fix return data
2024-03-14 15:43:26 -07:00
Timothy Carambat
aad32db5e3
Migrate document processor to class (#735)
* Migrate document processor to class

* forgot "new"
2024-02-16 16:32:25 -08:00