Commit Graph

208 Commits

Author SHA1 Message Date
Timothy Carambat
fef5bf06ec
add provider field to chats (#4959) 2026-02-02 20:11:18 -08:00
Timothy Carambat
97b140b4b4
Update LMStudio LLM & Embedder for API token (#4948)
- Updates Option panels to be consistent for other providers
adds API key to all LMStudio API calls
2026-01-30 11:13:32 -08:00
Timothy Carambat
0032c4da22
SambaNova Integration (#4943)
* SambaNova Integration

* lint
2026-01-29 18:48:22 -08:00
Timothy Carambat
b8dd7bc97e
Support PrivateModeAI Integration (#4937)
* Support PrivateModeAI Integration

* tooltip for proxy
2026-01-29 12:01:11 -08:00
Neha Prasad
3fc2432684
fix: prevent Citations UI glitching during streaming chats (#4897)
* fix: prevent Citations UI glitching during streaming chats

* replaced random keys with stable keys

* simplify citation glitch fix

* Remove unneeded memo()

* Simplify key logic

* Replace Boolean(source) with !!source

* change cohere to behave with citations like other models

---------

Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
Co-authored-by: Marcello Fitton <macfittondev@gmail.com>
2026-01-29 10:44:34 -08:00
Timothy Carambat
d36dc0f8a5 fix log line 2026-01-27 10:51:16 -08:00
Timothy Carambat
fe78e1c667
Refactor Ollama context window setting (#4909) 2026-01-27 10:50:40 -08:00
Timothy Carambat
cd5530de39
[Chore] Autotranslation tool using DMR (#4907)
* update translations + DMR loading

* updates to misspellings
2026-01-27 09:29:37 -08:00
Timothy Carambat
9191179c9e remove race condition regression for FoundryLocal provider 2026-01-16 16:45:05 -08:00
Timothy Carambat
607b5faf74
Extract Model Table to component (#4871)
* Extract Model Table to component
Add provider icons to header rows and installed models
Light mode supported
Mapping for model name id hints to provider
Update DMR to filter chat models by ability since not available via hub API

* linting + dev

* fix incorrect import
2026-01-16 16:34:58 -08:00
Timothy Carambat
e07963d3fc minor refactor for context window finder 2026-01-16 12:55:33 -08:00
Timothy Carambat
ff7cb17e34
Improved DMR support (#4863)
* Improve DMR support
- Autodetect models installed
- Grab all models from hub.docker to show available
- UI to handle render,search, install, and management of models
- Support functionality for chat, stream, and agentic calls

* forgot files

* fix loader circle being too large
fix tooltip width command
adjust location of docker installer open for web platform

* adjust imports
2026-01-14 15:55:26 -08:00
Timothy Carambat
7c3b7906e7
support AWS bedrock agents with streaming (#4850)
* support AWS bedrock agents with streaming

* Add back error handlers from previous fix
2026-01-09 15:36:58 -08:00
Timothy Carambat
133b62f9f6
patch AWS credential issue in docker context (#4842)
path AWS credential issue in docker context
2026-01-08 17:06:49 -08:00
Marcello Fitton
c4f19cec0e
Refactor LLMPerformanceMonitor.measureStream() to Use Options Object Pattern (#4786)
* Refactor LLMPerformanceMonitor to use options object for measureStream parameters

* Refactor invocations of `measureStream` to use options arguments

* Change invocation of `measureStream` in anthropic provider to use options argument

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-12-16 13:10:09 -08:00
Timothy Carambat
664f466e3f
4601 log model on response (#4781)
* add model tag to chatCompletion

* add modelTag `model` to async streaming
keeps default arguments for prompt token calculation where applied via explict arg

* fix HF default arg

* render all performance metrics as available for backward compatibility
add `timestamp` to both sync/async chat methods

* extract metrics string to function
2025-12-14 14:46:55 -08:00
Marcello Fitton
a7da757c84
Migrate Azure OpenAI Integration To v1 API | Enable Streaming for Reasoning Models in Azure OpenAI Basic Inference Provider (#4744)
* Refactor Azure OpenAI integration to use OpenAI SDK and the v1 API | Enable streaming for Azure Open AI basic inference provider

* Add info tooltip to inform user about 'Model Type' form field

* Add 'model_type_tooltip' key to multiple language translations

* Validate AZURE_OPENAI_ENDPOINT in provider construction

* remove unused import, update error handler, rescope URL utils

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-12-10 18:56:55 -08:00
方程
90e474abcb
Support Gitee AI(LLM Provider) (#3361)
* Support Gitee AI(LLM Provider)

* refactor(server): 重构 GiteeAI 模型窗口限制功能,暂时将窗口限制硬编码,计划使用外部 API 数据和缓存

* updates for Gitee AI

* use legacy lookup since gitee does not enable getting token context windows

* add more missing records

* reorder imports

---------

Co-authored-by: 方程 <fangcheng@oschina.cn>
Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-11-25 14:19:32 -08:00
Sean Hatfield
c913a2d68c
Prompt caching for Anthropic LLM and Agent providers (#4488)
* prompt caching for anthropic llm and agent providers

* add UI for control of ENV
simplify implementation

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-11-20 17:17:03 -08:00
Timothy Carambat
f0b3dab4c1
Simplify cache condition for LMStudio and Ollama to prevent race condition (#4669)
closes #4597
resolves #4572
closes #4600
resolves #4599
2025-11-20 16:32:02 -08:00
Sean Hatfield
49c29fb968
Z.ai LLM & agent provider (#4573)
* wip zai llm provider

* cleanup + add zai agent provider

* lint

* change how caching works for failed models

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-11-20 15:57:03 -08:00
Marcello Fitton
7a7ec969d7
Update Ollama AI Provider to Support Parsing "Thinking" Content From New Message Schema (#4587)
* add className prop to OllamaAILLM

* Enhance `OllamaAILLM.handleStream` to support parsing thinking content from the `message.thinking` property.

* refactor thinking property handler
patched ollama `@agent` flow calls

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-11-20 15:39:17 -08:00
Chetan Sarva
c169193fc4
feature: Support for AWS Bedrock API Keys (#4651)
* feat: add AWS Bedrock API Key option to settings panel

* feat: Bedrock API key auth method

* fix: hide IAM note when using bedrock api key

* move to camcelCase identifier for bedrock api key use
linting

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-11-20 15:38:45 -08:00
jonathanortega2023
7a0c149d2e
fix: Use eval_duration for output TPS calculations in Ollama LLM provider (#4568)
* fix: Use eval_duration for output TPS calculations and add as a metric field

* refactor usage of eval_duration from ollama metrics

* move eval_duration to usage

* overwrite duration in ollama provider wip measureAsyncFunction optional param

* allow for overloaded duration in measureAsyncFunction

* simplify flow for duration tracking

---------

Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-11-20 13:02:47 -08:00
Timothy Carambat
cf76bad452
Implement full chat and @agent chat user indentificiation for OpenRouter (#4668)
Implmenet chat and agentic chat user-id for OpenRouter
resolves #4553
closes #4482
2025-11-20 12:38:43 -08:00
timothycarambat
0a1a5a216a patch ollama context window error when unreachable 2025-10-06 16:25:06 -07:00
Timothy Carambat
c2e7ccc00f
Reimplement Cohere models for basic chat (#4489)
* Reimplement Cohere models
- Redo LLM implementation to grab models from endpoint and pre-filter
- Migrate embedding models to also grab from remote
- Add records for easy context window lookup'

* fix comment
2025-10-03 18:28:20 -07:00
Timothy Carambat
8cdadd8cb3
Sync models from remote for FireworksAI (#4475)
resolves #4474
2025-10-02 12:34:05 -07:00
Sean Hatfield
0b18ac6577
Model context limit auto-detection for LM Studio and Ollama LLM Providers (#4468)
* auto model context limit detection for ollama llm provider

* auto model context limit detection for lmstudio llm provider

* Patch Ollama to function and sync context windows like Foundry

* normalize how model context windows are cached from endpoint service
todo: move this into global utility class with MODEL_MAP
eager load models on boot to pre-cache them
add performance model improvements into ollama agent as well as apply n_ctx

* remove debug log

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-10-02 11:54:19 -07:00
Sean Hatfield
599a3fd8b8
Microsoft Foundry Local LLM provider & agent provider (#4435)
* add microsoft foundry local llm and agent providers

* minor change to fix early stop token + overloading of context window
always use user defined window _unless_ it is larger than the models real contenxt window
cache the context windows when we can from the API (0.7.*)+
Unload model forcefully on model change to prevent resource hogging

* add back token preference since some models have very large windows and can crash a machine
normalize cases

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-10-01 20:04:13 -07:00
Marcello Fitton
004327264a
Add stream options to Gemini LLM for usage tracking (#4466)
* Add stream options to Gemini LLM for usage tracking

* Update Gemini LLM to disable prompt token calculation

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-10-01 14:00:26 -07:00
Timothy Carambat
cd34063111
Patch OpenAI metrics (#4458)
resolves #4457
2025-09-30 15:19:34 -07:00
Timothy Carambat
c8f13d5f27
Enable custom HTTP response timeout for ollama (#4448) 2025-09-29 12:32:55 -07:00
Marcello Fitton
6855bbf695
Refactor Class Name Logging (#4426)
* Add className property to various LLM and embedder classes to fix logging bug after minification

* Fix bug with this.log method by applying the missing private field symbol
2025-09-25 15:34:19 -10:00
Timothy Carambat
9466f67162
Update the timeout value on all stream-timeout providers: (#4412)
- OpenRouter
- Novita
- CometAPI
updated to 3,000ms default with 500ms min
2025-09-19 08:52:20 -07:00
Sean Hatfield
1209606d9a
Migrate OpenAI LLM provider to use Responses API (#4404)
* migrate openai llm provider to use responses api

* add back image support

* dont recalc tokens from OpenAI since we get metrics back

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-09-18 21:15:19 -07:00
Marcello Fitton
50d4a198a4
Add User-Agent header on the requests sent by Generic OpenAI providers. (#4393)
* Add User-Agent header on the requests sent by Generic OpenAI providers.

* Moved getAnythingLLMUserAgent helper fn to server/endpoints/utils.js and changed fallback version string to "unknown"

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-09-17 13:08:18 -07:00
TensorNull
5922349bb7
feat: Implement CometAPI integration for chat completions and model m… (#4379)
* feat: Implement CometAPI integration for chat completions and model management

- Added CometApiLLM class for handling chat completions using CometAPI.
- Implemented model synchronization and caching mechanisms.
- Introduced streaming support for chat responses with timeout handling.
- Created CometApiProvider class for agent interactions with CometAPI.
- Enhanced error handling and logging throughout the integration.
- Established a structure for managing function calls and completions.

* linting

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-09-16 14:38:49 -07:00
Sean Hatfield
31a8ead823
Fix multimodal chats via openai compat api (#4135)
* fix multimodal chats via openai compat api

* lint

* add tests for multi-modal content in openai compat endpoint

* refactor to normalize how openai attachments are handled

* uncheck file

* rewrite tests, autodetect mime from dataurl, and spread attachments from prompt

* lint

* revert and fix tests

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-07-22 09:57:32 -07:00
Sean Hatfield
6d6bd14622
Moonshot AI LLM & agent provider (#4178)
* add moonshot ai LLM & agent provider

* fix moonshot agent calling

* handle attachments/fix moonshot llm provider

* update docs/example env

* add moonshot to onboarding privacy

* add moonshot to onboarding llm preference

* update privacy for moonshot ai

* update logo higher res

* remove caching and use modelmap
2025-07-22 09:56:51 -07:00
Fabio Nonato
0d7a7551b8
fix to support: feat2864 - using local credentials file with Amazon Bedrock (#3986)
* fix: feat2864

* patch default case

---------

Co-authored-by: nonatofabio <fnp@amazon.com>
Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-07-02 09:15:23 -07:00
Sean Hatfield
07129e81f8
Add option to disable streaming via env for generic openai provider (#4079)
* add option to disable streaming via env for generic openai provider

* move env check to streamingEnabled
2025-07-01 12:47:46 -07:00
Timothy Carambat
4eb951d40e
Fix model map staleness behavior or fallback (#3971)
* Fix model map staleness behavior or fallback

* patch url

* fix log

* dev build
2025-06-06 17:39:48 -07:00
Timothy Carambat
a57536b715
Handle invalid response bodies for ContextWindowFinder (#3896)
Handle invalid response bodies for contextwindowfinder
2025-05-27 15:40:06 -07:00
timothycarambat
2450e49ac3 hoisting cleanup for format var 2025-05-14 16:25:17 -07:00
timothycarambat
605910b76d forgot files for DPAIS 2025-05-14 15:26:14 -07:00
Timothy Carambat
e80492606a
Automatic Context window detection (#3817)
* Add context window finder from litellm maintained list
apply to all cloud providers, have client cache for 3 days

* linting
2025-05-14 11:03:19 -07:00
timothycarambat
492570dfed patch Azure image reading regressions
resolves #3811
2025-05-12 11:10:35 -07:00
Danny Steenman
5500fa2bc5
feat: support for iam roles for bedrock client (#2632)
* feat: implement iam role auth for bedrock

* fix: make client refreshes properly when switching between iam_user and iam_role

* checkout agent flow

* fix aiprovider for bedrock in agent use

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-05-06 13:48:15 -07:00
Tristan Stahnke
b64a77f29f
Refactor AWS Bedrock Provider for Multi-modal Support & Correct Token Limits (#3714)
* Fixed two primary issues discovered while using AWS Bedrock with Anthropic Claude Sonnet models:
- Context Window defaults to 8192 maximum, which isn't correct
- Multimodal stopped working when removing langchain, which was transparently handling image_url to a format sonnet expects.

* Ran `yarn lint`

* Updated .env.example to have aws bedrock examples too

* Refactor for readability
move utils for AWS specific functionality to subfile
add token output max to ENV so setting persits

---------

Co-authored-by: Tristan Stahnke <tristan.stahnke+gpsec@guidepointsecurity.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-05-06 12:55:24 -07:00