merlyn

Author	SHA1	Message	Date
Marcello Fitton	c4f19cec0e	Refactor `LLMPerformanceMonitor.measureStream()` to Use Options Object Pattern (#4786 ) * Refactor LLMPerformanceMonitor to use options object for measureStream parameters * Refactor invocations of `measureStream` to use options arguments * Change invocation of `measureStream` in anthropic provider to use options argument --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-12-16 13:10:09 -08:00
Timothy Carambat	664f466e3f	4601 log model on response (#4781 ) * add model tag to chatCompletion * add modelTag `model` to async streaming keeps default arguments for prompt token calculation where applied via explicit arg * fix HF default arg * render all performance metrics as available for backward compatibility add `timestamp` to both sync/async chat methods * extract metrics string to function	2025-12-14 14:46:55 -08:00
Marcello Fitton	a7da757c84	Migrate Azure OpenAI Integration To v1 API \| Enable Streaming for Reasoning Models in Azure OpenAI Basic Inference Provider (#4744 ) * Refactor Azure OpenAI integration to use OpenAI SDK and the v1 API \| Enable streaming for Azure Open AI basic inference provider * Add info tooltip to inform user about 'Model Type' form field * Add 'model_type_tooltip' key to multiple language translations * Validate AZURE_OPENAI_ENDPOINT in provider construction * remove unused import, update error handler, rescope URL utils --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-12-10 18:56:55 -08:00
方程	90e474abcb	Support Gitee AI(LLM Provider) (#3361 ) * Support Gitee AI(LLM Provider) * refactor(server): refactor the GiteeAI model window limit feature; hard-code the window limits for now, with plans to use external API data and caching * updates for Gitee AI * use legacy lookup since gitee does not enable getting token context windows * add more missing records * reorder imports --------- Co-authored-by: 方程 <fangcheng@oschina.cn> Co-authored-by: timothycarambat <rambat1010@gmail.com>	2025-11-25 14:19:32 -08:00
Sean Hatfield	c913a2d68c	Prompt caching for Anthropic LLM and Agent providers (#4488 ) * prompt caching for anthropic llm and agent providers * add UI for control of ENV simplify implementation --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-11-20 17:17:03 -08:00
Timothy Carambat	f0b3dab4c1	Simplify cache condition for LMStudio and Ollama to prevent race condition (#4669 ) closes #4597 resolves #4572 closes #4600 resolves #4599	2025-11-20 16:32:02 -08:00
Sean Hatfield	49c29fb968	Z.ai LLM & agent provider (#4573 ) * wip zai llm provider * cleanup + add zai agent provider * lint * change how caching works for failed models --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-11-20 15:57:03 -08:00
Marcello Fitton	7a7ec969d7	Update Ollama AI Provider to Support Parsing "Thinking" Content From New Message Schema (#4587 ) * add className prop to OllamaAILLM * Enhance `OllamaAILLM.handleStream` to support parsing thinking content from the `message.thinking` property. * refactor thinking property handler patched ollama `@agent` flow calls --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2025-11-20 15:39:17 -08:00
Chetan Sarva	c169193fc4	feature: Support for AWS Bedrock API Keys (#4651 ) * feat: add AWS Bedrock API Key option to settings panel * feat: Bedrock API key auth method * fix: hide IAM note when using bedrock api key * move to camelCase identifier for bedrock api key use linting --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2025-11-20 15:38:45 -08:00
jonathanortega2023	7a0c149d2e	fix: Use eval_duration for output TPS calculations in Ollama LLM provider (#4568 ) * fix: Use eval_duration for output TPS calculations and add as a metric field * refactor usage of eval_duration from ollama metrics * move eval_duration to usage * overwrite duration in ollama provider wip measureAsyncFunction optional param * allow for overloaded duration in measureAsyncFunction * simplify flow for duration tracking --------- Co-authored-by: shatfield4 <seanhatfield5@gmail.com> Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-11-20 13:02:47 -08:00
Timothy Carambat	cf76bad452	Implement full chat and `@agent` chat `user` identification for OpenRouter (#4668 ) Implement chat and agentic chat user-id for OpenRouter resolves #4553 closes #4482	2025-11-20 12:38:43 -08:00
timothycarambat	0a1a5a216a	patch ollama context window error when unreachable	2025-10-06 16:25:06 -07:00
Timothy Carambat	c2e7ccc00f	Reimplement Cohere models for basic chat (#4489 ) * Reimplement Cohere models - Redo LLM implementation to grab models from endpoint and pre-filter - Migrate embedding models to also grab from remote - Add records for easy context window lookup' * fix comment	2025-10-03 18:28:20 -07:00
Timothy Carambat	8cdadd8cb3	Sync models from remote for FireworksAI (#4475 ) resolves #4474	2025-10-02 12:34:05 -07:00
Sean Hatfield	0b18ac6577	Model context limit auto-detection for LM Studio and Ollama LLM Providers (#4468 ) * auto model context limit detection for ollama llm provider * auto model context limit detection for lmstudio llm provider * Patch Ollama to function and sync context windows like Foundry * normalize how model context windows are cached from endpoint service todo: move this into global utility class with MODEL_MAP eager load models on boot to pre-cache them add performance model improvements into ollama agent as well as apply n_ctx * remove debug log --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-10-02 11:54:19 -07:00
Sean Hatfield	599a3fd8b8	Microsoft Foundry Local LLM provider & agent provider (#4435 ) * add microsoft foundry local llm and agent providers * minor change to fix early stop token + overloading of context window always use user defined window _unless_ it is larger than the models real context window cache the context windows when we can from the API (0.7.)+ Unload model forcefully on model change to prevent resource hogging add back token preference since some models have very large windows and can crash a machine normalize cases --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-10-01 20:04:13 -07:00
Marcello Fitton	004327264a	Add stream options to Gemini LLM for usage tracking (#4466 ) * Add stream options to Gemini LLM for usage tracking * Update Gemini LLM to disable prompt token calculation --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-10-01 14:00:26 -07:00
Timothy Carambat	cd34063111	Patch OpenAI metrics (#4458 ) resolves #4457	2025-09-30 15:19:34 -07:00
Timothy Carambat	c8f13d5f27	Enable custom HTTP response timeout for ollama (#4448 )	2025-09-29 12:32:55 -07:00
Marcello Fitton	6855bbf695	Refactor Class Name Logging (#4426 ) * Add className property to various LLM and embedder classes to fix logging bug after minification * Fix bug with this.log method by applying the missing private field symbol	2025-09-25 15:34:19 -10:00
Timothy Carambat	9466f67162	Update the timeout value on all stream-timeout providers: (#4412 ) - OpenRouter - Novita - CometAPI updated to 3,000ms default with 500ms min	2025-09-19 08:52:20 -07:00
Sean Hatfield	1209606d9a	Migrate OpenAI LLM provider to use Responses API (#4404 ) * migrate openai llm provider to use responses api * add back image support * dont recalc tokens from OpenAI since we get metrics back --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-09-18 21:15:19 -07:00
Marcello Fitton	50d4a198a4	Add User-Agent header on the requests sent by Generic OpenAI providers. (#4393 ) * Add User-Agent header on the requests sent by Generic OpenAI providers. * Moved getAnythingLLMUserAgent helper fn to server/endpoints/utils.js and changed fallback version string to "unknown" --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-09-17 13:08:18 -07:00
TensorNull	5922349bb7	feat: Implement CometAPI integration for chat completions and model m… (#4379 ) * feat: Implement CometAPI integration for chat completions and model management - Added CometApiLLM class for handling chat completions using CometAPI. - Implemented model synchronization and caching mechanisms. - Introduced streaming support for chat responses with timeout handling. - Created CometApiProvider class for agent interactions with CometAPI. - Enhanced error handling and logging throughout the integration. - Established a structure for managing function calls and completions. * linting --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2025-09-16 14:38:49 -07:00
Sean Hatfield	31a8ead823	Fix multimodal chats via openai compat api (#4135 ) * fix multimodal chats via openai compat api * lint * add tests for multi-modal content in openai compat endpoint * refactor to normalize how openai attachments are handled * uncheck file * rewrite tests, autodetect mime from dataurl, and spread attachments from prompt * lint * revert and fix tests --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2025-07-22 09:57:32 -07:00
Sean Hatfield	6d6bd14622	Moonshot AI LLM & agent provider (#4178 ) * add moonshot ai LLM & agent provider * fix moonshot agent calling * handle attachments/fix moonshot llm provider * update docs/example env * add moonshot to onboarding privacy * add moonshot to onboarding llm preference * update privacy for moonshot ai * update logo higher res * remove caching and use modelmap	2025-07-22 09:56:51 -07:00
Fabio Nonato	0d7a7551b8	fix to support: feat2864 - using local credentials file with Amazon Bedrock (#3986 ) * fix: feat2864 * patch default case --------- Co-authored-by: nonatofabio <fnp@amazon.com> Co-authored-by: timothycarambat <rambat1010@gmail.com>	2025-07-02 09:15:23 -07:00
Sean Hatfield	07129e81f8	Add option to disable streaming via env for generic openai provider (#4079 ) * add option to disable streaming via env for generic openai provider * move env check to streamingEnabled	2025-07-01 12:47:46 -07:00
Timothy Carambat	4eb951d40e	Fix model map staleness behavior or fallback (#3971 ) * Fix model map staleness behavior or fallback * patch url * fix log * dev build	2025-06-06 17:39:48 -07:00
Timothy Carambat	a57536b715	Handle invalid response bodies for `ContextWindowFinder` (#3896 ) Handle invalid response bodies for contextwindowfinder	2025-05-27 15:40:06 -07:00
timothycarambat	2450e49ac3	hoisting cleanup for format var	2025-05-14 16:25:17 -07:00
timothycarambat	605910b76d	forgot files for DPAIS	2025-05-14 15:26:14 -07:00
Timothy Carambat	e80492606a	Automatic Context window detection (#3817 ) * Add context window finder from litellm maintained list apply to all cloud providers, have client cache for 3 days * linting	2025-05-14 11:03:19 -07:00
timothycarambat	492570dfed	patch Azure image reading regressions resolves #3811	2025-05-12 11:10:35 -07:00
Danny Steenman	5500fa2bc5	feat: support for iam roles for bedrock client (#2632 ) * feat: implement iam role auth for bedrock * fix: make client refreshes properly when switching between iam_user and iam_role * checkout agent flow * fix aiprovider for bedrock in agent use --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2025-05-06 13:48:15 -07:00
Tristan Stahnke	b64a77f29f	Refactor AWS Bedrock Provider for Multi-modal Support & Correct Token Limits (#3714 ) * Fixed two primary issues discovered while using AWS Bedrock with Anthropic Claude Sonnet models: - Context Window defaults to 8192 maximum, which isn't correct - Multimodal stopped working when removing langchain, which was transparently handling image_url to a format sonnet expects. * Ran `yarn lint` * Updated .env.example to have aws bedrock examples too * Refactor for readability move utils for AWS specific functionality to subfile add token output max to ENV so setting persists --------- Co-authored-by: Tristan Stahnke <tristan.stahnke+gpsec@guidepointsecurity.com> Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-05-06 12:55:24 -07:00
Sean Hatfield	8912d0f0fc	Add option to control KoboldCPP max response tokens (#3746 ) add option to control koboldcpp max response tokens	2025-05-02 14:12:06 -07:00
Shinya Suzuki	cd900f9e4c	Replace @azure/openai with openai, and update openai to version 4.95.1 (#3691 ) * Replace @azure/openai to OpenAI lib * Remove @azure/openai dependency and update openai to version 4.95.1 * linting * update logging fix translation dictionary error * remove bad ENV key that DNE linting Patch Azure OpenAI Migrate Azure Agent provider to use OpenAI Schema for tool calling performance * unset * migrate azure to use default OAI stream handler --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2025-04-29 11:21:39 -07:00
Shinya Suzuki	98c46c04e4	Update Azure AI options and model map with new model configurations (#3660 ) * Update Azure AI options and model map with new model configurations * linting --------- Co-authored-by: Shinya Suzuki <shinya.s.825@gmail.com> Co-authored-by: timothycarambat <rambat1010@gmail.com>	2025-04-16 09:08:40 -07:00
timothycarambat	1d1fb817b0	linting	2025-04-15 12:51:08 -07:00
Michał Rudziński	be27299897	handling of citations in openRouter provider #3581 (#3620 ) * handling of citations in openRouter provider #3581 * Update pplx enrichToken function comment Modify OR enrichToken to be generic handler function with optional params handle _just_ Perplexity in-line citations since no other models support this functionality * remove console log --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2025-04-15 10:57:09 -07:00
Timothy Carambat	1b59295f89	Refactor Gemini to use OpenAI interface API (#3616 ) * Refactor Gemini to use OpenAI interface API * add TODO * handle errors better (gemini) * remove unused code	2025-04-07 17:18:31 -07:00
Timothy Carambat	4ac900f645	Gemini model list sync (#3609 ) * Update defaultModels.js add gemma-3-27b-it to v1BetaModels * Update defaultModels.js 20250330 model update * Update defaultModels.js remove text embedding * Update name and inputTokenLimit modelMap.js * Update gemini to load models from both endpoints dedupe models decide endpoint based on experimental status from fetch add util script for maintainers reduce cache time on gemini models to 1 day * remove comment --------- Co-authored-by: DreamerC <dreamerwolf.tw@gmail.com>	2025-04-07 13:45:16 -07:00
Timothy Carambat	78c83383d8	Overhaul AWS Bedrock provider (#3537 ) * Patch AWS Bedrock provider for newer models and performance * patch prompt constructor	2025-03-25 15:58:16 -07:00
Timothy Carambat	66b4bf2679	Add support for Anthropics /model endpoint (finally) (#3376 ) * Add support for Anthropics /model endpoint (finally) * dev	2025-02-28 13:29:43 -08:00
cnJasonZ	2aeb4c2961	Add new model provider PPIO (#3211 ) * feat: add new model provider PPIO * fix: fix ppio model fetching * fix: code lint * reorder LLM update interface for streaming and chats to use valid keys linting --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2025-02-27 10:53:00 -08:00
Skanda Kaashyap	d1354caccb	[FEAT] Add claude-3-7 (#3337 ) * add claude 3-7 sonnet * made all the changes everywhere * add 3-7-sonnet-latest model * lint --------- Co-authored-by: shatfield4 <seanhatfield5@gmail.com>	2025-02-25 12:52:17 -08:00
timothycarambat	12b43256a0	lint	2025-02-18 20:49:40 -08:00
Sushanth Srivatsa	3fd0fe8fc5	2749 ollama client auth token (#3005 ) * ollama auth token provision * auth token provision * ollama auth provision * ollama auth token * ollama auth provision * token input field css fix * Fix provider handler not using key sensible fallback to not break existing installs re-order of input fields null-check for API key and header optional insert on request linting * apply header and auth to agent invocations * upgrading to ollama 5.10 for passing headers to constructor * rename Auth systemSetting key to be more descriptive linting and copy * remove untracked files + update gitignore * remove debug * patch lockfile --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2025-02-18 16:00:17 -08:00
Timothy Carambat	cc3d619061	Add handling to reasoning models for Generic OpenAI connector (#3183 ) * Add handling to reasoning models for Generic OpenAI connector resolves #3177 * linting	2025-02-12 10:28:44 -08:00

194 Commits