Commit Graph

229 Commits

Author SHA1 Message Date
PQ32 Developer
5bcef7d604 Patch 4: Increase OpenRouter promptWindowLimit fallback via AGENT_CONTEXT_WINDOW_FALLBACK
Both static and instance fallbacks increased from 4096 to 2,000,000 tokens.
When a model isn't in the models.json cache (e.g. Grok), it fell back to
4096 tokens, causing severe truncation of file reads.
Reuses AGENT_CONTEXT_WINDOW_FALLBACK env var from Patch 2.
2026-05-10 15:21:45 -07:00
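
The fallback behaviour described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual patch: `resolveContextWindow`, the shape of `cache`, and passing the environment in as a plain object are assumptions made for the example. Only the `AGENT_CONTEXT_WINDOW_FALLBACK` variable name and the 2,000,000 default come from the commit message.

```typescript
// Hypothetical sketch of a context-window lookup with an env-configurable fallback.
// The default value is taken from the commit message above.
const DEFAULT_FALLBACK = 2_000_000;

type Env = Record<string, string | undefined>;

function resolveContextWindow(
  model: string,
  cache: Record<string, number>, // stand-in for the models.json cache
  env: Env
): number {
  // Prefer the cached limit when the model is known.
  const cached = cache[model];
  if (typeof cached === "number" && cached > 0) return cached;

  // Otherwise use the env override if it parses to a positive number.
  const parsed = Number(env["AGENT_CONTEXT_WINDOW_FALLBACK"]);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_FALLBACK;
}
```

An unset or non-numeric override falls through to the default, so a misconfigured variable cannot reintroduce a tiny window.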
Marcello Fitton
38206a14b3
fix: omit temperature param for Bedrock Claude Opus 4.7 (#5472)
* conditionally pass temperature based on AWS Bedrock model ID

* move to config

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2026-04-21 14:39:42 -07:00
Timothy Carambat
faf2dd998e
Add capability detection and streaming usage for Generic OpenAI provider (#5477)
- Add ENV-configurable model capabilities (tools, reasoning, vision,
  imageGeneration) via PROVIDER_SUPPORTS_* environment variables
- Add optional stream usage reporting via GENERIC_OPEN_AI_REPORT_USAGE
- Fix streaming tool calls for providers that send null tool_call.id
  (e.g., mlx-server) by generating fallback UUIDs
- Refactor supportsNativeToolCalling() to use centralized capabilities API
2026-04-21 09:31:58 -07:00
Timothy Carambat
e344109bcb
Update Lemonade Integration to support v10.1.0 changes (#5378)
Update Lemonade Integration
Fix ApiKey nullification check causing hard throw
2026-04-07 11:21:28 -07:00
Marcello Fitton
0bfd27c6df
feat: add optional API key support for Lemonade provider (#5281)
* add API key param to Lemonade LLM Provider and Embedding Provider

* add LEMONADE_LLM_API_KEY to .env.example

* add api key to aibitat provider

* fix api key from being sent to frontend

* fix tooltip id

* add null fallback for `apiKey`

* remove console log

* add missing api keys

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2026-03-30 14:44:12 -07:00
Timothy Carambat
1b0add0318
add Dynamic max_tokens retrieval for Anthropic models (#5255) 2026-03-23 15:45:22 -07:00
Mike Lambert
9d242bc053
Add User-Agent header for Anthropic API calls (#5174)
* Add User-Agent header for Anthropic API calls

Passes User-Agent: AnythingLLM/{version} to the Anthropic SDK
so Anthropic can identify traffic from AnythingLLM.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* remove test, simplify header default

* unset change to spread

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2026-03-23 15:19:33 -07:00
Timothy Carambat
868358597e Remove use_mlock from Ollama to solve WARN logs in ollama 0.17
resolves #5182
2026-03-10 09:08:05 -07:00
Timothy Carambat
4e3bcfc616
Add custom fetch to embedder for Ollama (#5180)
Refactor ollama timeout to be shared. Add custom fetch to embedder for ollama as well
2026-03-09 11:47:00 -07:00
Ryan
179a823ab1
Fix: Azure OpenAI model key collision (#5092)
* fix: Migrate AzureOpenAI model key from OPEN_MODEL_PREF to prevent the naming collision. No effort necessary from current users.

* test: add backwards compat tests for AzureOpenAI model key migration

* patch missing env example file

* linting

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2026-03-05 17:12:08 -08:00
Marcello Fitton
4a4378ed99
chore: add ESLint to /server (#5126)
* add eslint config to server

* add break statements to switch case

* add support for browser globals and turn off empty catch blocks

* disable lines with useless try/catch wrappers

* format

* fix no-undef errors

* disable lines violating no-unsafe-finally

* ignore syncStaticLists.mjs

* use proper null check for creatorId instead of unreachable nullish coalescing

* remove unneeded typescript eslint comment

* make no-unused-private-class-members a warning

* disable line for no-empty-objects

* add new lint script

* fix no-unused-vars violations

* make no-unused-vars an error

---------

Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2026-03-05 16:32:45 -08:00
Timothy Carambat
ee4b208f95 native tool calling detection for novita 2026-03-05 10:19:03 -08:00
Timothy Carambat
86431c6833
5112 or stream metrics and finish reason (#5117)
* update metric tracking for OR + fix finish_reason missing from transitive chunks

* linting + comments
closes #5113
resolves #5112
2026-03-02 18:53:29 -08:00
Timothy Carambat
a6ba5a4034
Lemonade integration (#5077)
* lemonade integration

* lemonade embedder

* log

* load model

* readme updates

* update embedder privacy entry
2026-02-27 11:02:38 -08:00
Timothy Carambat
fc29461718 resolve Ollama string strict num_ctx
resolves #5081
2026-02-27 09:20:48 -08:00
Timothy Carambat
ac0b1d401d
Native Tool calling (#5071)
* checkpoint

* test MCP and flows

* add native tool call detection back to LMStudio

* add native tool call loops for Ollama

* Add ability detection to DMR (regex parse)

* bedrock and generic openai with ENV flag

* deepseek native tool calling

* localAI native function

* groq support

* linting, add litellm and OR native tool calling via flag
2026-02-26 13:37:56 -08:00
Timothy Carambat
d46c032787 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm 2026-02-16 16:05:20 -08:00
Timothy Carambat
35af646a53 patch out no finish reason from https://github.com/microsoft/Foundry-Local/issues/423 2026-02-16 16:05:14 -08:00
Timothy Carambat
dba1be0600
add support for custom headers for LLM Generic OpenAI (#4999)
* add support for custom headers for LLM Generic OpenAI

* add env
2026-02-13 09:19:36 -08:00
Marcello Fitton
1ccf468158
fix: correct TPS calculation for Generic OpenAI provider with llama.cpp (#4981)
* add check for timings field on final chunk to override usage data

* refactor: extract llama.cpp timings into reusable private method

Move timings extraction into #extractTimings so it can be shared
by both streaming (handleStream) and non-streaming (getChatCompletion)
code paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* lint and cleanup

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2026-02-12 14:40:35 -08:00
Timothy Carambat
5fb1281891 patch out max_tokens from block output amount
resolves #3421
2026-02-12 14:20:08 -08:00
Timothy Carambat
fef5bf06ec
add provider field to chats (#4959) 2026-02-02 20:11:18 -08:00
Timothy Carambat
97b140b4b4
Update LMStudio LLM & Embedder for API token (#4948)
- Updates Option panels to be consistent for other providers
adds API key to all LMStudio API calls
2026-01-30 11:13:32 -08:00
Timothy Carambat
0032c4da22
SambaNova Integration (#4943)
* SambaNova Integration

* lint
2026-01-29 18:48:22 -08:00
Timothy Carambat
b8dd7bc97e
Support PrivateModeAI Integration (#4937)
* Support PrivateModeAI Integration

* tooltip for proxy
2026-01-29 12:01:11 -08:00
Neha Prasad
3fc2432684
fix: prevent Citations UI glitching during streaming chats (#4897)
* fix: prevent Citations UI glitching during streaming chats

* replaced random keys with stable keys

* simplify citation glitch fix

* Remove unneeded memo()

* Simplify key logic

* Replace Boolean(source) with !!source

* change cohere to behave with citations like other models

---------

Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
Co-authored-by: Marcello Fitton <macfittondev@gmail.com>
2026-01-29 10:44:34 -08:00
Timothy Carambat
d36dc0f8a5 fix log line 2026-01-27 10:51:16 -08:00
Timothy Carambat
fe78e1c667
Refactor Ollama context window setting (#4909) 2026-01-27 10:50:40 -08:00
Timothy Carambat
cd5530de39
[Chore] Autotranslation tool using DMR (#4907)
* update translations + DMR loading

* updates to misspellings
2026-01-27 09:29:37 -08:00
Timothy Carambat
9191179c9e remove race condition regression for FoundryLocal provider 2026-01-16 16:45:05 -08:00
Timothy Carambat
607b5faf74
Extract Model Table to component (#4871)
* Extract Model Table to component
Add provider icons to header rows and installed models
Light mode supported
Mapping for model name id hints to provider
Update DMR to filter chat models by ability since not available via hub API

* linting + dev

* fix incorrect import
2026-01-16 16:34:58 -08:00
Timothy Carambat
e07963d3fc minor refactor for context window finder 2026-01-16 12:55:33 -08:00
Timothy Carambat
ff7cb17e34
Improved DMR support (#4863)
* Improve DMR support
- Autodetect models installed
- Grab all models from hub.docker to show available
- UI to handle render, search, install, and management of models
- Support functionality for chat, stream, and agentic calls

* forgot files

* fix loader circle being too large
fix tooltip width command
adjust location of docker installer open for web platform

* adjust imports
2026-01-14 15:55:26 -08:00
Timothy Carambat
7c3b7906e7
support AWS bedrock agents with streaming (#4850)
* support AWS bedrock agents with streaming

* Add back error handlers from previous fix
2026-01-09 15:36:58 -08:00
Timothy Carambat
133b62f9f6
patch AWS credential issue in docker context (#4842)
patch AWS credential issue in docker context
2026-01-08 17:06:49 -08:00
Marcello Fitton
c4f19cec0e
Refactor LLMPerformanceMonitor.measureStream() to Use Options Object Pattern (#4786)
* Refactor LLMPerformanceMonitor to use options object for measureStream parameters

* Refactor invocations of `measureStream` to use options arguments

* Change invocation of `measureStream` in anthropic provider to use options argument

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-12-16 13:10:09 -08:00
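
For readers unfamiliar with the options object pattern named in this commit, here is a generic sketch of the before and after shapes. The function names and fields (`label`, `promptTokens`, `runPromptTokenCalculation`) are hypothetical and chosen only for illustration; the real `measureStream` parameters are not shown in this log.

```typescript
// Hypothetical option fields; not the actual measureStream signature.
interface MeasureOptions {
  label: string;
  promptTokens?: number;
  runPromptTokenCalculation?: boolean;
}

// Before: positional arguments, which are easy to misorder at call sites.
function describePositional(
  label: string,
  promptTokens = 0,
  runPromptTokenCalculation = true
): string {
  return `${label}:${promptTokens}:${runPromptTokenCalculation}`;
}

// After: a single options object with named, defaulted fields.
function describeWithOptions({
  label,
  promptTokens = 0,
  runPromptTokenCalculation = true,
}: MeasureOptions): string {
  return `${label}:${promptTokens}:${runPromptTokenCalculation}`;
}
```

With the second form a caller can override only the last field, for example `describeWithOptions({ label: "anthropic", runPromptTokenCalculation: false })`, without passing a placeholder for `promptTokens`.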
Timothy Carambat
664f466e3f
4601 log model on response (#4781)
* add model tag to chatCompletion

* add modelTag `model` to async streaming
keeps default arguments for prompt token calculation where applied via explicit arg

* fix HF default arg

* render all performance metrics as available for backward compatibility
add `timestamp` to both sync/async chat methods

* extract metrics string to function
2025-12-14 14:46:55 -08:00
Marcello Fitton
a7da757c84
Migrate Azure OpenAI Integration To v1 API | Enable Streaming for Reasoning Models in Azure OpenAI Basic Inference Provider (#4744)
* Refactor Azure OpenAI integration to use OpenAI SDK and the v1 API | Enable streaming for Azure OpenAI basic inference provider

* Add info tooltip to inform user about 'Model Type' form field

* Add 'model_type_tooltip' key to multiple language translations

* Validate AZURE_OPENAI_ENDPOINT in provider construction

* remove unused import, update error handler, rescope URL utils

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-12-10 18:56:55 -08:00
方程
90e474abcb
Support Gitee AI(LLM Provider) (#3361)
* Support Gitee AI(LLM Provider)

* refactor(server): refactor the GiteeAI model window limit feature; hardcode the window limits for now, with plans to use external API data and caching

* updates for Gitee AI

* use legacy lookup since gitee does not enable getting token context windows

* add more missing records

* reorder imports

---------

Co-authored-by: 方程 <fangcheng@oschina.cn>
Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-11-25 14:19:32 -08:00
Sean Hatfield
c913a2d68c
Prompt caching for Anthropic LLM and Agent providers (#4488)
* prompt caching for anthropic llm and agent providers

* add UI for control of ENV
simplify implementation

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-11-20 17:17:03 -08:00
Timothy Carambat
f0b3dab4c1
Simplify cache condition for LMStudio and Ollama to prevent race condition (#4669)
closes #4597
resolves #4572
closes #4600
resolves #4599
2025-11-20 16:32:02 -08:00
Sean Hatfield
49c29fb968
Z.ai LLM & agent provider (#4573)
* wip zai llm provider

* cleanup + add zai agent provider

* lint

* change how caching works for failed models

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-11-20 15:57:03 -08:00
Marcello Fitton
7a7ec969d7
Update Ollama AI Provider to Support Parsing "Thinking" Content From New Message Schema (#4587)
* add className prop to OllamaAILLM

* Enhance `OllamaAILLM.handleStream` to support parsing thinking content from the `message.thinking` property.

* refactor thinking property handler
patched ollama `@agent` flow calls

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-11-20 15:39:17 -08:00
Chetan Sarva
c169193fc4
feature: Support for AWS Bedrock API Keys (#4651)
* feat: add AWS Bedrock API Key option to settings panel

* feat: Bedrock API key auth method

* fix: hide IAM note when using bedrock api key

* move to camelCase identifier for bedrock api key use
linting

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-11-20 15:38:45 -08:00
jonathanortega2023
7a0c149d2e
fix: Use eval_duration for output TPS calculations in Ollama LLM provider (#4568)
* fix: Use eval_duration for output TPS calculations and add as a metric field

* refactor usage of eval_duration from ollama metrics

* move eval_duration to usage

* overwrite duration in ollama provider wip measureAsyncFunction optional param

* allow for overloaded duration in measureAsyncFunction

* simplify flow for duration tracking

---------

Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-11-20 13:02:47 -08:00
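
The calculation this fix refers to can be sketched as below. The field names `eval_count` and `eval_duration` follow Ollama's response metrics, where durations are reported in nanoseconds; the `outputTokensPerSecond` helper and the `OllamaMetrics` interface are hypothetical names for this example and are not taken from the commit.

```typescript
// Subset of Ollama's response metrics relevant to output speed.
interface OllamaMetrics {
  eval_count: number; // number of generated tokens
  eval_duration: number; // generation time in nanoseconds
}

// Hypothetical helper: output tokens per second based on generation time only,
// rather than on total wall-clock duration.
function outputTokensPerSecond(metrics: OllamaMetrics): number {
  if (metrics.eval_duration <= 0) return 0; // avoid division by zero
  const seconds = metrics.eval_duration / 1e9;
  return metrics.eval_count / seconds;
}
```

Using the generation-only duration keeps model load and prompt evaluation time from deflating the reported rate.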
Timothy Carambat
cf76bad452
Implement full chat and @agent chat user identification for OpenRouter (#4668)
Implement chat and agentic chat user-id for OpenRouter
resolves #4553
closes #4482
2025-11-20 12:38:43 -08:00
timothycarambat
0a1a5a216a patch ollama context window error when unreachable 2025-10-06 16:25:06 -07:00
Timothy Carambat
c2e7ccc00f
Reimplement Cohere models for basic chat (#4489)
* Reimplement Cohere models
- Redo LLM implementation to grab models from endpoint and pre-filter
- Migrate embedding models to also grab from remote
- Add records for easy context window lookup

* fix comment
2025-10-03 18:28:20 -07:00
Timothy Carambat
8cdadd8cb3
Sync models from remote for FireworksAI (#4475)
resolves #4474
2025-10-02 12:34:05 -07:00
Sean Hatfield
0b18ac6577
Model context limit auto-detection for LM Studio and Ollama LLM Providers (#4468)
* auto model context limit detection for ollama llm provider

* auto model context limit detection for lmstudio llm provider

* Patch Ollama to function and sync context windows like Foundry

* normalize how model context windows are cached from endpoint service
todo: move this into global utility class with MODEL_MAP
eager load models on boot to pre-cache them
add performance model improvements into ollama agent as well as apply n_ctx

* remove debug log

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-10-02 11:54:19 -07:00