* WIP agentic tool call streaming
- OpenAI
- Anthropic
- Azure OpenAI
* WIP rest of providers EXCLUDES Bedrock and GenericOpenAI
* patch untooled complete/streaming to use chatCallback provider from provider class and not assume OpenAI client struct
example: Ollama
* modify ollama to function with its own overrides
normalize completion/stream outputs across providers/untooled
* dev build
* fix message sanitization for anthropic agent streaming
* wip fix anthropic agentic streaming sanitization
* patch gemini, webgenui, generic aibitat providers + disable providers unable to test
* refactor anthropic aibitat provider for empty message and tool call formatting
* Add frontend missing prop check
update Azure for streaming support
update Gemini to support streaming on gemini-* models
generic OpenAI disable streaming
verify localAI support
verify NVIDIA Nim support
* DPAIS, remove temp from call, support streaming
* remove 0 temp to remove possibility of bad temp error/500s/400s
* Patch condition where model is non-streamable and no tools are present or called resulting in the provider `handleFunctionCallChat` being called - which returns a string.
This would then fail in Untooled.complete since response would be a string and not the expected `response.choices?.[0]?.message`
Modified this line to handle both conditions for stream/non-streaming and tool presence or lack thereof
* Allow generic OpenAI to be streamable since using untooled it should work fine
honor disabled streaming for provider where that concern may apply for regular chats
* rename function and more gemini-specific function to gemini provider
* add comments for readability
.complete on azure should be non-streaming as this is the sync response
* migrate CometAPI, but disable as we cannot test
---------
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
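The string-versus-object condition described in the patch above can be illustrated with a minimal sketch. The helper name `extractMessage` and the shape of the returned object are assumptions for illustration only; the commit does not show how the actual line in `Untooled.complete` was changed.

```javascript
// Hypothetical helper, not the project's code: accepts either the plain string
// returned by a provider's non-streaming fallback or an OpenAI-style
// completion object, and returns a message-like object in both cases.
function extractMessage(response) {
  if (typeof response === "string") {
    // Assumption: a bare string is treated as the assistant's text reply.
    return { role: "assistant", content: response };
  }
  // OpenAI-style shape: response.choices[0].message (null when absent).
  return response?.choices?.[0]?.message ?? null;
}
```

Normalizing at a single point like this would let downstream code read `.content` without caring which path produced the response.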
* Fix system prompt variable color logic by removing unused variable type from switch statement and adding new types.
* Add workspace id, name and user id as default system prompt variables
* Combine user and workspace variable evaluations into a single if statement, reducing redundant code.
* minor refactor
* add systemPromptVariable expandSystemPromptVariables test cases
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Fix JSDOC for updateOrCreateCollection
* Add sanitizeForJsonb method to PGVector for safe JSONB handling
This new method recursively sanitizes values intended for JSONB storage, removing disallowed control characters and ensuring safe insertion into PostgreSQL. The method is integrated into the vector insertion process to sanitize metadata before database operations.
* Add unit tests for PGVector.sanitizeForJsonb method
This commit introduces a comprehensive test suite for the PGVector.sanitizeForJsonb method, ensuring it correctly handles various input types, including null, undefined, strings with disallowed control characters, objects, arrays, and Date objects. The tests verify that the method sanitizes inputs without mutating the original data structures.
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
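A recursive, non-mutating sanitizer of the kind described above might look roughly like the following sketch. This is not the actual `PGVector.sanitizeForJsonb` code; in particular, the exact set of stripped characters and the conversion of Date objects to ISO strings are assumptions.

```javascript
// Assumption: "disallowed control characters" means NUL and the other C0
// controls except tab, line feed and carriage return.
const DISALLOWED_CONTROL_CHARS = /[\u0000-\u0008\u000B\u000C\u000E-\u001F]/g;

// Hypothetical sketch: returns a sanitized copy and never mutates the input.
function sanitizeForJsonb(value) {
  if (value === null || value === undefined) return value;
  if (typeof value === "string") return value.replace(DISALLOWED_CONTROL_CHARS, "");
  if (value instanceof Date) return value.toISOString(); // assumption: dates become ISO strings
  if (Array.isArray(value)) return value.map(sanitizeForJsonb);
  if (typeof value === "object") {
    const copy = {};
    for (const [key, inner] of Object.entries(value)) copy[key] = sanitizeForJsonb(inner);
    return copy;
  }
  return value; // numbers, booleans, etc. pass through unchanged
}
```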
* Add HTTP request logging middleware for development mode
- Introduced httpLogger middleware to log HTTP requests and responses.
- Enabled logging only in development mode to assist with debugging.
* Update httpLogger middleware to disable time logging by default
* Add httpLogger middleware for development mode in collector service
* Refactor httpLogger middleware to rename timeLogs parameter to enableTimestamps for clarity
* Make HTTP Logger only mount in development and environment flag is enabled.
* Update .env.example to clarify HTTP Logger configuration comments
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
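As a rough illustration of the logging middleware described above, here is a sketch in the Express `(req, res, next)` style. Only the `enableTimestamps` option name comes from the commit messages; the `log` option, the `shouldMountHttpLogger` helper, and the `ENABLE_HTTP_LOGGER` variable name are hypothetical.

```javascript
// Hypothetical sketch of an Express-style request logger.
function httpLogger({ enableTimestamps = false, log = console.log } = {}) {
  return function (req, res, next) {
    // Log once the response has been sent so the status code is known.
    res.on("finish", () => {
      const prefix = enableTimestamps ? `[${new Date().toISOString()}] ` : "";
      log(`${prefix}${req.method} ${req.originalUrl ?? req.url} ${res.statusCode}`);
    });
    next();
  };
}

// Mount only in development AND when an opt-in flag is set.
// ENABLE_HTTP_LOGGER is an assumed name, not taken from .env.example.
function shouldMountHttpLogger(env = process.env) {
  return env.NODE_ENV === "development" && env.ENABLE_HTTP_LOGGER === "true";
}
```

With an Express app this would be wired up as `if (shouldMountHttpLogger()) app.use(httpLogger());`.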
* Add className property to various LLM and embedder classes to fix logging bug after minification
* Fix bug with this.log method by applying the missing private field symbol
* migrate openai llm provider to use responses api
* add back image support
* don't recalc tokens from OpenAI since we get metrics back
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* Add ENV to configure api request delay for generic open ai embedding engine
* yarn lint formatting
* refactor
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Add User-Agent header on the requests sent by Generic OpenAI providers.
* Moved getAnythingLLMUserAgent helper fn to server/endpoints/utils.js and changed fallback version string to "unknown"
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* Added the ability to pass in metadata to the /document/upload/{folderName} endpoint
* Added the ability to pass in metadata to the /document/upload-link endpoint
* feat: added metadata to document/upload api endpoint
* simplify optional metadata in document dev api endpoints
* lint
* patch handling of metadata in dev api
* Linting, small comments
---------
Co-authored-by: jstawskigmi <jstawski@getmyinterns.org>
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* feat: Implement CometAPI integration for chat completions and model management
- Added CometApiLLM class for handling chat completions using CometAPI.
- Implemented model synchronization and caching mechanisms.
- Introduced streaming support for chat responses with timeout handling.
- Created CometApiProvider class for agent interactions with CometAPI.
- Enhanced error handling and logging throughout the integration.
- Established a structure for managing function calls and completions.
* linting
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* export image support for json and jsonl
* add tests and cleanup functionality
* add test for convertTo prepare function
* comment
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* add chroma cloud as new vector db provider
* update docker example env
* extend chroma class to chroma cloud
* update readme
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Added exa-search case to the search provider switch in web-browsing.js
* Added ExaSearchOptions component for API key input
* update
* Patch missing image crashing UI
Fix issue where ENV key did not exist or was saved on click
Update copy for provider
Add Docs for ENV keys for manual placements
update systemssettings for returning key saved to UI
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Create parse endpoint in collector (#4212)
* create parse endpoint in collector
* revert cleanup temp util call
* lint
* remove unused cleanupTempDocuments function
* revert slug change
minor change for destinations
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Add parsed files table and parse server endpoints (#4222)
* add workspace_parsed_files table + parse endpoints/models
* remove dev api parse endpoint
* remove unneeded imports
* iterate over all files + remove unneeded update function + update telemetry debounce
* Upload UI/UX context window check + frontend alert (#4230)
* prompt user to embed if exceeds prompt window + handle embed + handle cancel
* add tokenCountEstimate to workspace_parsed_files + optimizations
* use util for path locations + use safeJsonParse
* add modal for user decision on overflow of context window
* lint
* dynamic fetching of provider/model combo + inject parsed documents
* remove unneeded comments
* popup ui for attaching/removing files + warning to embed + wip fetching states on update
* remove prop drilling, fetch files/limits directly in attach files popup
* rework ux of FE + BE optimizations
* fix ux of FE + BE optimizations
* Implement bidirectional sync for parsed file states
linting
small changes and comments
* move parse support to another endpoint file
simplify calls and loading of records
* button borders
* enable default users to upload parsed files but NOT embed
* delete cascade on user/workspace/thread deletion to remove parsedFileRecord
* enable bgworker with "always" jobs and optional document sync jobs
orphan document job: Will find any broken reference files to prevent overpollution of the storage folder. This will run 10s after boot and every 12hr after
* change run timeout for orphan job to 1m to allow settling before spawning a worker
* linting and cleanup pr
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* dev build
* fix tooltip hiding during embedding overflow files
* prevent crash log from ERRNO on parse files
* unused import
* update docs link
* Migrate parsed-files to GET endpoint
patch logic for grabbing models names from utils
better handling for undetermined context windows (null instead of POSITIVE_INFINITY)
UI placeholder for null context windows
* patch URL
---------
Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
* WIP on mobile connections
todo: register devices
todo: data sync or connection
* improve connection flow and registration
add streaming from service
TODO: user scoping
* dev build mobile support
* fix path
* handle relative URLs
* handle localhost access in product
* add device de-register
* sync styles
* move UI to be out of the normal path since beta only
* Add user scoping to mobile connection requests
Remigrate DB for user associations
Implement temp token registration to prevent unauthorized device registration requests
cleanup middlewares
* WIP on embedder selection
TODO: apply splitting and query prefixes (if applicable)
* wip on upsert
* Support base model
support nomic-text-embed-v1
support multilingual-e5-small
Add prefixing for both embedding and query for RAG tasks
Add chunking prefix to all vector dbs to apply prefix when possible
Show dropdown and auto-pull on new selection
* norm translations
* move supported models to constants
handle null selection or invalid selection on dropdown
update comments
* dev
* patch text splitter maximums for now
* normalize translations
* add tests for splitter functionality
* normalize
---------
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
* fix multimodal chats via openai compat api
* lint
* add tests for multi-modal content in openai compat endpoint
* refactor to normalize how openai attachments are handled
* uncheck file
* rewrite tests, autodetect mime from dataurl, and spread attachments from prompt
* lint
* revert and fix tests
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
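Detecting a MIME type from a data URL, as mentioned in the test rewrite above, can be sketched as follows. The function name `mimeFromDataUrl` is hypothetical; the `text/plain` default for an omitted media type follows the data URL scheme definition (RFC 2397).

```javascript
// Hypothetical sketch: extract the media type from a data URL.
// Returns null when the input is not a data URL at all.
function mimeFromDataUrl(dataUrl) {
  // data:[<mediatype>][;params...],<data>
  const match = /^data:([^;,]+)?(?:;[^,]*)?,/.exec(String(dataUrl));
  if (!match) return null;
  return match[1] ?? "text/plain"; // omitted media type defaults to text/plain
}
```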
* configurable message limit for embed widget
* remove console log
* make field optional + add fallback
* rework validation logic
* lint
* remove field specific guard, it cannot be lte 0 like all other fields
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* allow static defined prompt variables to be accessed by any authd user
* improve filtering logic and comments
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* Enable UI/UX for model swapping in chat window
* forgot component
* patch useGetProviders hook to set loading on change of provider
* dev build
* normalize translations
* patch how model default is provided
---------
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
* implement importing of agent flows from community hub
* auto enable flow on import
* remove unused blocks for docker
prevent importing or saving of agent flows that have unsupported blocks for version or platform
* dev build
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* wip: create direct output switch on last block and send response to ui
* lint
* Return flow on direct output enabled
prevent new blocks below direct output block
Update executor/aibitat to handle skipping of handler outputs
* dev build
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* implement json parser for agent flow to allow dot notation and array access
* lint
* patch parser for pathing on objects
add tests for cases
* Move webscraping deps to closure
update tests to not modify env since no longer needed
do not modify paths with spaces - could be text key with spaces
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
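Dot notation with array access, as described for the agent flow parser above, could be resolved along these lines. This is a sketch under assumptions: the name `getByPath` and the exact path grammar are hypothetical. Keys containing spaces are left intact, in line with the note about text keys with spaces.

```javascript
// Hypothetical sketch: resolve paths like "a.b[0].c" against a parsed JSON value.
function getByPath(data, path) {
  const keys = [];
  for (const segment of path.split(".")) {
    // A segment is an optional key name followed by zero or more [index] parts.
    const match = /^([^\[\]]*)((?:\[\d+\])*)$/.exec(segment);
    if (!match) return undefined; // malformed segment
    if (match[1] !== "") keys.push(match[1]);
    for (const index of match[2].matchAll(/\[(\d+)\]/g)) keys.push(Number(index[1]));
  }
  let current = data;
  for (const key of keys) {
    if (current === null || current === undefined) return undefined;
    current = current[key];
  }
  return current;
}
```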
* Fix: MCP server environment inheritance across platforms
- Fix GUI applications not inheriting proper PATH/NODE_PATH
- Correct Docker NODE_PATH to point to modules directory
- Ensure base environment is always provided
- Preserve user environment variable overrides
- Resolves -32000 Connection closed errors on macOS/Linux GUI
* Fix: MCP server environment inheritance across platforms after linting
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
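The environment merge described in this fix (a base environment that is always present, with user-supplied variables taking precedence) can be sketched as below. The function name `buildServerEnv` and the choice of `PATH` and `NODE_PATH` as the only inherited keys are assumptions drawn from the bullet points, not the actual implementation.

```javascript
// Hypothetical sketch: build the environment for a spawned MCP server.
function buildServerEnv(userEnv = {}, baseEnv = process.env) {
  const base = {};
  // Assumption: only PATH and NODE_PATH are inherited from the parent process.
  for (const key of ["PATH", "NODE_PATH"]) {
    if (baseEnv[key] !== undefined) base[key] = baseEnv[key];
  }
  // Spread order matters: user-defined values override the inherited base.
  return { ...base, ...userEnv };
}
```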
* PGVector support for vector db storage
* forgot files
* comments
* dev build
* Add ENV connection and table schema validations for vector table
add .reset call to drop embedding table when changing the AnythingLLM embedder
update instructions
Add preCheck error reporting in UpdateENV
add timeout to pg connection
* update setup
* update README
* update doc
* feat: implement iam role auth for bedrock
* fix: make client refreshes properly when switching between iam_user and iam_role
* checkout agent flow
* fix aiprovider for bedrock in agent use
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Fixed two primary issues discovered while using AWS Bedrock with Anthropic Claude Sonnet models:
- Context Window defaults to 8192 maximum, which isn't correct
- Multimodal stopped working when removing langchain, which was transparently handling image_url to a format sonnet expects.
* Ran `yarn lint`
* Updated .env.example to have aws bedrock examples too
* Refactor for readability
move utils for AWS specific functionality to subfile
add token output max to ENV so setting persists
---------
Co-authored-by: Tristan Stahnke <tristan.stahnke+gpsec@guidepointsecurity.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* allow custom headers in upload-link endpoint
* override loader.scrape to allow for passing of headers in langchain puppeteer
* lint
* Rename some variables
move positional args to named args
update documentation to reflect arg changes and function signatures
validate header object before attempting to forward it to the request
* update header validation for custom headers
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Enable bypass of ip limitations via ENV in collector startup
resolves #3625
connect #3626
* dev build
* bump dockerx build action
* enable runtime setting config of collector requests
* comments and linting for option passing
* unset
* unset
* update docs link
* linting and docs
* Update Azure AI options and model map with new model configurations
* linting
---------
Co-authored-by: Shinya Suzuki <shinya.s.825@gmail.com>
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* handling of citations in openRouter provider #3581
* Update pplx enrichToken function comment
Modify OR enrichToken to be generic handler function with optional params
handle _just_ Perplexity in-line citations since no other models support this functionality
* remove console log
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Update defaultModels.js
add gemma-3-27b-it to v1BetaModels
* Update defaultModels.js
20250330 model update
* Update defaultModels.js
remove text embedding
* Update name and inputTokenLimit modelMap.js
* Update gemini to load models from both endpoints
dedupe models
decide endpoint based on experimental status from fetch
add util script for maintainers
reduce cache time on gemini models to 1 day
* remove comment
---------
Co-authored-by: DreamerC <dreamerwolf.tw@gmail.com>
* WIP MCP full compatibility layer
* implement MCP agent function wrapping and invocation methods
* Add `uvx` to docker bin for MCP executions
* dev build
* prune removed data
* Wrap MCP servers to lazy load items to not block the UI
Mobile bug fixes
* arm64 test build
* reset dev builder
* remove unused prop
* enable slash commands in dev api
* lint
* Remove ability to use default slash commands in API request
Add `reset` param to body that can reset chats according to the api chat execution parameters
Allow null `message` if `reset` is set in request.
Added early return for if message is null and reset is true
Enable chat to reset chat history and continue `message` execution
Added generic WorkspaceChat history reset function. Deprecated others
* update grep function comment
remove debug
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Add multilingual support for OCR module
* Add OCR language as server var that is passed into Collector
Support all valid tesseract language codes
Filter and parse only valid codes with fallbacks
* persist TARGET_OCR_LANG
* update docker example env
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
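Filtering a configured language list down to valid codes with a fallback might look like the sketch below. The function name, the comma separator, the `eng` fallback, and the small code set (an illustrative subset of tesseract language codes, not the full list) are all assumptions.

```javascript
// Illustrative subset only; the real list of tesseract codes is much longer.
const KNOWN_OCR_CODES = new Set(["eng", "deu", "fra", "spa", "jpn"]);

// Hypothetical sketch: parse a value such as TARGET_OCR_LANG into valid codes.
function parseOcrLanguages(raw, fallback = ["eng"]) {
  const codes = String(raw ?? "")
    .split(",") // assumption: codes are comma-separated
    .map((code) => code.trim().toLowerCase())
    .filter((code) => KNOWN_OCR_CODES.has(code));
  const unique = [...new Set(codes)];
  return unique.length > 0 ? unique : fallback;
}
```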
* feat: add new model provider PPIO
* fix: fix ppio model fetching
* fix: code lint
* reorder LLM
update interface for streaming and chats to use valid keys
linting
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Windows development environment variables support
* moved cross-env to dev dependencies
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* add bio to users table
* lint
* add bio field to edit user admin page
* fix bio saving on new user
* simplify updating localstorage user
* linting
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* feat: Add endpoint to retrieve documents by folder name
* isWithin Check on path to prevent path traversal
* feat: Add endpoint to upload documents to a specified folder
* refactor upload to folder endpoint + update jsdoc for swagger
* linting
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
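A containment check of the kind the `isWithin` bullet refers to can be sketched with Node's `path` module. This is an illustrative version rather than the project's helper; treating a path equal to the root as "not within" is an assumption.

```javascript
const path = require("path");

// Sketch: true only when `inner` resolves to a location strictly inside `outer`.
function isWithin(outer, inner) {
  const relative = path.relative(path.resolve(outer), path.resolve(inner));
  return (
    relative !== "" && // assumption: the root itself does not count as "within"
    relative !== ".." &&
    !relative.startsWith(".." + path.sep) && // escapes upward via ../
    !path.isAbsolute(relative) // e.g. a different drive on Windows
  );
}
```

A folder name taken from a request would be joined onto the documents root and rejected when `isWithin(root, candidate)` is false.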
* add claude 3-7 sonnet
* made all the changes everywhere
* add 3-7-sonnet-latest model
* lint
---------
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
* feat: Add endpoint to retrieve documents by folder name
* isWithin Check on path to prevent path traversal
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
bug fixes for sanitizing Namespaces and handling chunk size limit of astradb collections in each doc
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* Add `querySelectorAll` capability to web-scraping block
* patches and fallbacks
* fix styles of text in web scraping block
---------
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
* chore: rename Github to GitHub
Signed-off-by: Adam Setch <adam.setch@outlook.com>
* chore: rename Github to GitHub
Signed-off-by: Adam Setch <adam.setch@outlook.com>
* Undo some code changes for references
---------
Signed-off-by: Adam Setch <adam.setch@outlook.com>
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Add tokenizer improvements via Singleton class
linting
* dev build
* Estimation fallback when string exceeds a fixed byte size
* Add notice to tiktoken on backend
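The estimation fallback mentioned above can be sketched as follows. The byte threshold, the characters-per-token ratio, and the injected `encode` function are all assumptions for illustration; the commit does not state the real values.

```javascript
// Hypothetical sketch: use the real tokenizer for small inputs and a cheap
// length-based estimate once the input exceeds a fixed byte size.
function estimateTokens(text, encode, { maxBytes = 50000, charsPerToken = 4 } = {}) {
  if (Buffer.byteLength(text, "utf8") > maxBytes) {
    // Assumed heuristic: roughly one token per `charsPerToken` characters.
    return Math.ceil(text.length / charsPerToken);
  }
  return encode(text).length;
}
```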
* feat: add support for voyage-3-large and voyage-code-3 embedding models
- Add voyage-3-large and voyage-code-3 to VoyageAiOptions dropdown
- Update getMaxEmbeddingLength to support 32k context for new models
- Update .env.example with new model options
* unset env example
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Reranker WIP
* add caching and singleton loading
* Add field to workspaces for vectorSearchMode
Add UI for lancedb to change mode
update all search endpoints to pass in reranker prop if provider can use it
* update hint text
* When reranking, swap score to rerank score
* update optchain
* Add support for Google Generative AI (Gemini) embedder
* Add missing example in docker
Fix UI key elements in options
Add Gemini to data handling section
Patch issues with chunk handling during embedding
* remove dupe in env
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Add support for gemini authenticated models endpoint
add customModels entry
add un-authed fallback to default listing
separate models by experimental status
resolves #2866
* add back improved logic for apiVersion decision making
* add writable fields to dev api new workspace endpoint
* lint
* implement validations for workspace model
* update swagger comments
* simplify validations for workspace on frontend and API
* cleanup validations
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* wip remove all docs clear vector db on embedder/vector db change
* purge all cached docs and remove docs from workspaces on vectordb/embedder change
* lint
* remove unneeded console log
* remove reset vector stores endpoint and move to server side updateENV with postUpdate check
* reset embed module
* remove unused import
* simplify deletion process
rescoped document deletion to be more general for speed, everything needs to be reset anyway
fixed issue where unembedded docs not in any workspaces, but cached, were not removed
* add back missing readme file
update warning text modals
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* Add vector search API endpoint
* Add missing import
* Modify the data that is returned
* Change similarityThreshold to scoreThreshold
As this is what is actually returned by the search
* Removing logging (oops!)
* chore: regen swagger docs for new endpoint
fix: update function to sanity check values to prevent crashes during search
---------
Co-authored-by: Scott Bowler <scott@dcsdigital.co.uk>
* wip hub connection page fe + backend
* lint
* implement backend for local hub items + placeholder endpoints to fetch hub app data
* fix hebrew translations
* revamp community integration flow
* change sidebar
* Auto import if id in URL param
remove preview in card screen and instead go to import flow
* get user's items + team items from hub + ui improvements to hub settings
* lint
* fix merge conflict
* refresh hook for community items
* add fallback for user items
* Disable bundle items by default on all instances
* remove translations (will complete later)
* loading skeleton
* Make community hub endpoints admin only
show visibility on items
combine import/apply for items so they are event logged for review
* improve middleware and import flow
* community hub ui updates
* Adjust importing process
* community hub to dev
* Add webscraper preload into imported plugins
* add runtime property to plugins
* Fix button status on imported skill change
show alert on skill change
Update markdown type and theme on import of agent skill
* update documentation paths
* remove unused import
* linting
* review loading state
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* exposes `maxConcurrentChunks` parameter for the generic openai embedder through configuration. This allows setting a batch size for endpoints which don't support the default of 500
* Update new field to new UI
make getter to ensure proper type and format
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
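Batching text chunks by a configurable size, which is what `maxConcurrentChunks` controls according to the description above, can be illustrated with a small helper. The name `toChunks` is hypothetical; the default of 500 is taken from the commit message.

```javascript
// Hypothetical sketch: split an array into batches of at most `size` items.
function toChunks(items, size = 500) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

An endpoint that accepts at most 32 inputs per request would then be called once per batch from `toChunks(textChunks, 32)`.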
* togetherai llama 3.2 vision models support
* remove console log
* fix listing to reflect what is on the chart
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>