Default fallback increased from 8,000 to 2,000,000 tokens.
When a provider doesn't declare `promptWindowLimit` (e.g. Grok via OpenRouter),
the fallback was far too small, causing severe truncation.
Configure via AGENT_CONTEXT_WINDOW_FALLBACK in .env to override.
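A minimal sketch of the resolution order described above; the helper name `getAgentContextWindow` is hypothetical, while `AGENT_CONTEXT_WINDOW_FALLBACK`, the 2,000,000 default, and `promptWindowLimit` come from this change:

```ts
// Hypothetical sketch: resolve the agent context window with an env override.
const DEFAULT_CONTEXT_WINDOW_FALLBACK = 2_000_000;

function getAgentContextWindow(provider: {
  promptWindowLimit?: () => number | null;
}): number {
  // Prefer the provider's declared limit when it exists.
  const declared = provider.promptWindowLimit?.();
  if (declared && declared > 0) return declared;

  // Otherwise fall back to the .env override, then the built-in default.
  const fromEnv = Number(process.env.AGENT_CONTEXT_WINDOW_FALLBACK);
  return Number.isFinite(fromEnv) && fromEnv > 0
    ? fromEnv
    : DEFAULT_CONTEXT_WINDOW_FALLBACK;
}
```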
* add API key param to Lemonade LLM Provider and Embedding Provider
* add LEMONADE_LLM_API_KEY to .env.example
* add api key to aibitat provider
* prevent api key from being sent to frontend
* fix tooltip id
* add null fallback for `apiKey`
* remove console log
* add missing api keys
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* Add automatic chat mode with native tool calling support
Introduces a new automatic chat mode (now the default) that invokes tools on its own when the provider supports native tool calling. Conditionally shows or hides the @agent command based on whether native tooling is available.
- Add supportsNativeToolCalling() to AI providers (OpenAI, Anthropic, and Azure always support it; others opt in via ENV); a sketch follows this list
- Update all locale translations with new mode descriptions
- Enhance translator to preserve Trans component tags
- Remove deprecated ability tags UI
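A hedged sketch of the capability check; the env var name `NATIVE_TOOL_CALLING_PROVIDERS` is an assumption, not the real flag:

```ts
// Providers that always support native tool calling, per the change above.
const ALWAYS_NATIVE = new Set(["openai", "anthropic", "azure"]);

function supportsNativeToolCalling(providerSlug: string): boolean {
  if (ALWAYS_NATIVE.has(providerSlug)) return true;
  // Others opt in via ENV, e.g. NATIVE_TOOL_CALLING_PROVIDERS="ollama,groq"
  // (assumed variable name for illustration only).
  const optIns = (process.env.NATIVE_TOOL_CALLING_PROVIDERS ?? "")
    .split(",")
    .map((s) => s.trim().toLowerCase());
  return optIns.includes(providerSlug.toLowerCase());
}

// The @agent command can then be shown or hidden per workspace:
// showAgentCommand = supportsNativeToolCalling(workspace.chatProvider);
```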
* rebase translations
* WIP on image attachments. Supports initial image attachment + subsequent attachments
* persist images
* Image attachments and updates for providers
* desktop pre-change
* always show command on failure
* add back gemini streaming detection
* move provider native tooling flag to Provider func
* whoops - forgot to delete
* strip "@agent" from prompts to prevent weird replies
* translations for automatic-mode (#5145)
* translations for automatic-mode
* rebase
* translations
* lint
* fix dead translations
* change default for now to chat mode just for rollout
* remove pfp for workspace
* passthrough workspace for showAgentCommand detection and rendering
* Agent API automatic mode support
* ephemeral attachments passthrough
* support reading of pinned documents in agent context
* Improve DMR support
- Autodetect installed models
- Grab all models from hub.docker.com to show what's available (sketched below)
- UI to handle rendering, search, install, and management of models
- Support functionality for chat, stream, and agentic calls
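A rough sketch of the model listing, assuming Docker Hub's public v2 repositories endpoint and an `ai` namespace for DMR models; both the namespace and the response shape are assumptions, not confirmed by this change:

```ts
interface HubRepo {
  name: string;
  description: string;
}
interface HubPage {
  next: string | null;
  results: HubRepo[];
}

// List model repositories from Docker Hub, following pagination via `next`.
async function listAvailableDmrModels(namespace = "ai"): Promise<HubRepo[]> {
  const repos: HubRepo[] = [];
  let url: string | null = `https://hub.docker.com/v2/repositories/${namespace}/?page_size=100`;
  while (url) {
    const page = (await fetch(url).then((r) => r.json())) as HubPage;
    repos.push(...page.results);
    url = page.next;
  }
  return repos;
}
```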
* forgot files
* fix loader circle being too large
fix command tooltip width
adjust where the docker installer opens on the web platform
* adjust imports
* implement cohere agent support
* run yarn lint
* modernize Cohere
add supported LangChain method
redo streaming since it was not working
fix agent call looping, which was not functioning
* change default model to real model tag
add case statement for model tag
* remove debug
* update default
* only whitelist known labels
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* Support Gitee AI(LLM Provider)
* refactor(server): rework the GiteeAI model window limit; hardcode the limit for now, with plans to use external API data and caching
* updates for Gitee AI
* use legacy lookup since Gitee does not expose token context windows (sketched below)
* add more missing records
* reorder imports
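A sketch of what the legacy static lookup could look like; the model names and limits here are illustrative, not authoritative values from this change:

```ts
// Static context-window lookup, since Gitee AI does not expose token limits.
// Entries below are illustrative placeholders only.
const GITEE_AI_CONTEXT_WINDOWS: Record<string, number> = {
  "Qwen2-72B-Instruct": 32_768, // illustrative entry
  "glm-4-9b-chat": 32_768, // illustrative entry
};

function giteePromptWindowLimit(modelName: string): number {
  // Fall back to a conservative default for unlisted models.
  return GITEE_AI_CONTEXT_WINDOWS[modelName] ?? 4_096;
}
```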
---------
Co-authored-by: 方程 <fangcheng@oschina.cn>
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* feat: add AWS Bedrock API Key option to settings panel
* feat: Bedrock API key auth method
* fix: hide IAM note when using bedrock api key
* move to camelCase identifier for bedrock api key use
linting
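A hedged sketch of selecting the auth method; the settings shape and `authMethod` values are hypothetical, and the `AWS_BEARER_TOKEN_BEDROCK` wiring is an assumption about how the SDK consumes Bedrock API keys:

```ts
import { BedrockRuntimeClient } from "@aws-sdk/client-bedrock-runtime";

// Sketch only: choose between IAM user credentials and a Bedrock API key.
function buildBedrockClient(settings: {
  authMethod: "iam_user" | "api_key"; // hypothetical identifiers
  region: string;
  accessKeyId?: string;
  secretAccessKey?: string;
  apiKey?: string;
}) {
  if (settings.authMethod === "api_key") {
    // Bedrock API keys act as a bearer token; assumed env-var wiring.
    process.env.AWS_BEARER_TOKEN_BEDROCK = settings.apiKey;
    return new BedrockRuntimeClient({ region: settings.region });
  }
  return new BedrockRuntimeClient({
    region: settings.region,
    credentials: {
      accessKeyId: settings.accessKeyId!,
      secretAccessKey: settings.secretAccessKey!,
    },
  });
}
```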
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* add microsoft foundry local llm and agent providers
* minor change to fix early stop token + overloading of context window
always use the user-defined window _unless_ it is larger than the model's real context window (clamping sketched below)
cache the context windows from the API when we can (0.7.*+)
Unload the model forcefully on model change to prevent resource hogging
* add back token preference since some models have very large windows and can crash a machine
normalize cases
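The clamping rule sketched, with hypothetical names:

```ts
const contextWindowCache = new Map<string, number>();

// Honor the user-defined window unless it exceeds the model's real window.
async function effectiveContextWindow(
  modelId: string,
  userDefined: number,
  fetchRealWindow: (id: string) => Promise<number>
): Promise<number> {
  let real = contextWindowCache.get(modelId);
  if (real === undefined) {
    real = await fetchRealWindow(modelId);
    contextWindowCache.set(modelId, real); // cache from the API (0.7.*+)
  }
  return Math.min(userDefined, real);
}
```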
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* WIP agentic tool call streaming
- OpenAI
- Anthropic
- Azure OpenAI
* WIP rest of providers EXCLUDES Bedrock and GenericOpenAI
* patch untooled complete/streaming to use the provider class's chatCallback and not assume an OpenAI client struct
example: Ollama
* modify ollama to function with its own overrides
normalize completion/stream outputs across providers/untooled
* dev build
* fix message sanitization for anthropic agent streaming
* wip fix anthropic agentic streaming sanitization
* patch gemini, webgenui, and generic aibitat providers + disable providers we were unable to test
* refactor anthropic aibitat provider for empty message and tool call formatting
* Add frontend missing prop check
update Azure for streaming support
update Gemini to streaming support on gemini-* models
disable streaming for generic OpenAI
verify localAI support
verify NVIDIA Nim support
* DPAIS: remove temp from call, support streaming
* remove temperature of 0 to remove the possibility of bad-temp errors/500s/400s
* Patch condition where the model is non-streamable and no tools are present or called, resulting in the provider's `handleFunctionCallChat` being called, which returns a string.
This would then fail in `Untooled.complete` since the response would be a string and not the expected `response.choices?.[0]?.message`
Modified this line to handle both streaming and non-streaming, with and without tools (normalization sketched below)
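A sketch of that normalization, with hypothetical names:

```ts
type ChatMessage = { role: string; content: string | null };
type CompletionLike = string | { choices?: Array<{ message?: ChatMessage }> };

// Wrap a bare string as an OpenAI-style message so downstream code can always
// read `choices[0].message` regardless of stream mode or tool presence.
function normalizeCompletion(response: CompletionLike): ChatMessage | null {
  if (typeof response === "string") {
    return { role: "assistant", content: response };
  }
  return response.choices?.[0]?.message ?? null;
}
```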
* Allow generic OpenAI to be streamable since, using untooled, it should work fine
honor disabled streaming for providers where that concern may apply to regular chats
* rename function and move gemini-specific function to gemini provider
* add comments for readability
.complete on azure should be non-streaming as this is the sync response
* migrate CometAPI, but disable as we cannot test
---------
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
* feat: Implement CometAPI integration for chat completions and model management
- Added CometApiLLM class for handling chat completions using CometAPI.
- Implemented model synchronization and caching mechanisms.
- Introduced streaming support for chat responses with timeout handling (a sketch follows this list).
- Created CometApiProvider class for agent interactions with CometAPI.
- Enhanced error handling and logging throughout the integration.
- Established a structure for managing function calls and completions.
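An illustrative sketch of the stream timeout handling; the helper name and the 30-second idle default are assumptions:

```ts
// Abort the request if no chunk arrives within `idleMs`; the producer passed
// in via `makeStream` is expected to honor the AbortSignal.
async function* streamWithTimeout(
  makeStream: (signal: AbortSignal) => AsyncIterable<string>,
  idleMs = 30_000
): AsyncGenerator<string> {
  const controller = new AbortController();
  let timer = setTimeout(() => controller.abort(), idleMs);
  try {
    for await (const chunk of makeStream(controller.signal)) {
      clearTimeout(timer); // reset the idle timer on every chunk
      timer = setTimeout(() => controller.abort(), idleMs);
      yield chunk;
    }
  } finally {
    clearTimeout(timer);
  }
}
```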
* linting
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* feat: implement iam role auth for bedrock
* fix: make the client refresh properly when switching between iam_user and iam_role (sketched below)
* checkout agent flow
* fix aiprovider for bedrock in agent use
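A sketch of the refresh fix, with hypothetical names: memoize the client per auth method so switching rebuilds it instead of reusing a stale instance:

```ts
// Cache the client keyed by auth method; a method change invalidates it.
let cachedClient: { key: string; client: unknown } | null = null;

function getClient(
  authMethod: "iam_user" | "iam_role",
  buildClient: () => unknown // hypothetical factory
) {
  if (cachedClient?.key !== authMethod) {
    cachedClient = { key: authMethod, client: buildClient() };
  }
  return cachedClient.client;
}
```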
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* feat: add new model provider PPIO
* fix: fix ppio model fetching
* fix: code lint
* reorder LLM
update interface for streaming and chats to use valid keys
linting
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* feat: add new model provider: Novita AI
* feat: finished novita AI
* fix: code lint
* remove unneeded logging
* add back log for novita stream not self closing
* Clarify ENV vars for LLM/embedder separation for the future
Patch ENV check for workspace/agent provider
---------
Co-authored-by: Jason <ggbbddjm@gmail.com>
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
* Issue #1943: Add support for LLM provider - Fireworks AI
* Update UI selection boxes
Update base AI keys for future embedder support if needed
Add agent capabilities for FireworksAI
* class only return
---------
Co-authored-by: Aaron Van Doren <vandoren96+1@gmail.com>
* Enable agent context windows to be accurate per provider:model
* Refactor model mapping to external file
Use token count for document length instead of char count
reference promptWindowLimit from AIProvider in a central location
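A rough sketch of token-based document length; the chars/4 heuristic is a common approximation, not necessarily what this change uses:

```ts
// Estimate document length in tokens rather than raw characters.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4); // common approximation, sketch only
}

// Compare the estimate against the provider's promptWindowLimit, referenced
// from the AIProvider in one central place.
function fitsInContext(docText: string, promptWindowLimit: number): boolean {
  return estimateTokens(docText) <= promptWindowLimit;
}
```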
* remove unused imports
* add LMStudio agent support (generic) support
"work" with non-tool callable LLMs, highly dependent on system specs
* add comments
* enable few-shot prompting per function for OSS models
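A hypothetical sketch of how per-function few-shot examples could be carried and injected for OSS models without native tool calling; the shape is an assumption:

```ts
interface AgentFunction {
  name: string;
  description: string;
  examples?: Array<{ prompt: string; call: string }>; // per-function shots
}

// Render a function's examples as prompt text for injection into the
// system prompt of models that cannot call tools natively.
function fewShotBlock(fn: AgentFunction): string {
  if (!fn.examples?.length) return "";
  return fn.examples
    .map((ex) => `User: ${ex.prompt}\nAssistant: ${ex.call}`)
    .join("\n");
}
```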
* Add Agent support for Ollama models
* azure, groq, koboldcpp agent support complete + WIP togetherai
* WIP gemini agent support
* WIP gemini blocked and will not fix for now
* azure fix
* merge fix
* add localai agent support
* azure untooled agent support
* merge fix
* refactor implementation of several agent providers
* update bad merge comment
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>