* Create parse endpoint in collector (#4212)
* create parse endpoint in collector
* revert cleanup temp util call
* lint
* remove unused cleanupTempDocuments function
* revert slug change
minor change for destinations
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Add parsed files table and parse server endpoints (#4222)
* add workspace_parsed_files table + parse endpoints/models
* remove dev api parse endpoint
* remove unneeded imports
* iterate over all files + remove unneeded update function + update telemetry debounce
* Upload UI/UX context window check + frontend alert (#4230)
* prompt user to embed if exceeds prompt window + handle embed + handle cancel
* add tokenCountEstimate to workspace_parsed_files + optimizations
* use util for path locations + use safeJsonParse
* add modal for user decision on overflow of context window
* lint
* dynamic fetching of provider/model combo + inject parsed documents
* remove unneeded comments
* popup ui for attaching/removing files + warning to embed + wip fetching states on update
* remove prop drilling, fetch files/limits directly in attach files popup
* rework ux of FE + BE optimizations
* fix ux of FE + BE optimizations
* Implement bidirectional sync for parsed file states
linting
small changes and comments
* move parse support to another endpoint file
simplify calls and loading of records
* button borders
* enable default users to upload parsed files but NOT embed
* delete cascade on user/workspace/thread deletion to remove parsedFileRecord
* enable bgworker with "always" jobs and optional document sync jobs
orphan document job: Will find any broken reference files to prevent overpollution of the storage folder. This will run 10s after boot and every 12hr after
* change run timeout for orphan job to 1m to allow settling before spawning a worker
* linting and cleanup pr
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* dev build
* fix tooltip hiding during embedding overflow files
* prevent crash log from ERRNO on parse files
* unused import
* update docs link
* Migrate parsed-files to GET endpoint
patch logic for grabbing models names from utils
better handling for undetermined context windows (null instead of Pos_INIFI)
UI placeholder for null context windows
* patch URL
---------
Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
* wip bg workers for live document sync
* Add ability to re-embed specific documents across many workspaces via background queue
bgworkser is gated behind expieremental system setting flag that needs to be explictly enabled
UI for watching/unwatching docments that are embedded.
TODO: UI to easily manage all bg tasks and see run results
TODO: UI to enable this feature and background endpoints to manage it
* create frontend views and paths
Move elements to correct experimental scope
* update migration to delete runs on removal of watched document
* Add watch support to YouTube transcripts (#1716)
* Add watch support to YouTube transcripts
refactor how sync is done for supported types
* Watch specific files in Confluence space (#1718)
Add failure-prune check for runs
* create tmp workflow modifications for beta image
* create tmp workflow modifications for beta image
* create tmp workflow modifications for beta image
* dual build
update copy of alert modals
* update job interval
* Add support for live-sync of Github files
* update copy for document sync feature
* hide Experimental features from UI
* update docs links
* [FEAT] Implement new settings menu for experimental features (#1735)
* implement new settings menu for experimental features
* remove unused context save bar
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* dont run job on boot
* unset workflow changes
* Add persistent encryption service
Relay key to collector so persistent encryption can be used
Encrypt any private data in chunkSources used for replay during resync jobs
* update jsDOC
* Linting and organization
* update modal copy for feature
---------
Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>