* fix: Use eval_duration for output TPS calculations and add it as a metric field (sketched below)
* refactor usage of eval_duration from ollama metrics
* move eval_duration to usage
* overwrite duration in ollama provider; WIP: optional param for measureAsyncFunction
* allow an overridden duration to be passed to measureAsyncFunction
* simplify flow for duration tracking
---------
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
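A minimal sketch of the duration flow these commits describe, in Node-style JavaScript. Ollama reports `eval_count` (output tokens) and `eval_duration` (nanoseconds) in its response metrics, so output TPS falls out directly; the `measureAsyncFunction` signature with an optional override is an assumption based on the commit messages, not the project's exact API.

```js
// Output TPS from Ollama metrics: eval_count tokens over eval_duration,
// where eval_duration is reported in nanoseconds.
function outputTps({ eval_count, eval_duration }) {
  if (!eval_count || !eval_duration) return null;
  return eval_count / (eval_duration / 1e9);
}

// Wraps an async call and reports its duration in milliseconds. When the
// provider supplies a more accurate figure (e.g. Ollama's eval_duration),
// the optional param overrides the wall-clock measurement.
async function measureAsyncFunction(fn, { overrideDurationMs = null } = {}) {
  const start = Date.now();
  const result = await fn();
  return { result, duration: overrideDurationMs ?? Date.now() - start };
}
```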
* auto model context limit detection for ollama llm provider
* auto model context limit detection for lmstudio llm provider
* Patch Ollama to function and sync context windows like Foundry
* normalize how model context windows are cached from the endpoint service (sketched below)
TODO: move this into a global utility class with MODEL_MAP
eager-load models on boot to pre-cache them
add model performance improvements to the Ollama agent and apply n_ctx there as well
* remove debug log
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
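A sketch of the context-window detection and caching these commits describe, assuming a recent Ollama where `/api/show` returns a `model_info` object with an architecture-prefixed `*.context_length` key; the cache shape and fallback value are illustrative.

```js
const contextWindowCache = new Map();

async function ollamaContextWindow(basePath, modelName, fallback = 4096) {
  if (contextWindowCache.has(modelName)) return contextWindowCache.get(modelName);
  try {
    const res = await fetch(`${basePath}/api/show`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ name: modelName }),
    });
    const { model_info = {} } = await res.json();
    // Keys are architecture-prefixed, e.g. "llama.context_length".
    const key = Object.keys(model_info).find((k) => k.endsWith(".context_length"));
    const limit = key ? model_info[key] : fallback;
    contextWindowCache.set(modelName, limit);
    return limit;
  } catch {
    return fallback; // endpoint unreachable: use a safe default
  }
}
```

Eager-loading on boot then amounts to calling this once per installed model so chats never pay the lookup cost.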
* Enable agent context windows to be accurate per provider:model
* Refactor model mapping to external file
Use token count for document length instead of char count
reference promptWindowLimit from AIProvider in a central location (sketched below)
* remove unused imports
* refactor stream/chat/embed-stream into a single execution path so it is easier to maintain and build upon
* no thread in sync chat since only the API uses it
adjust import locations
* add support for mistral api
* update docs to show support for Mistral
* add a default temperature to all providers; suggested defaults differ per provider
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
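A sketch of the external model map and central lookup the refactor points at; the table entries here are illustrative, not the project's actual MODEL_MAP contents.

```js
// modelMap.js - one external table instead of per-provider literals.
const MODEL_MAP = {
  openai: { "gpt-4o": 128000, "gpt-3.5-turbo": 16385 },
  anthropic: { "claude-3-opus-20240229": 200000 },
  mistral: { "mistral-large-latest": 32000 },
};

// Central lookup so every AIProvider resolves its prompt window the same way.
function promptWindowLimit(provider, model, fallback = 4096) {
  return MODEL_MAP[provider]?.[model] ?? fallback;
}

module.exports = { MODEL_MAP, promptWindowLimit };
```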
* WIP: model selection per workspace (migrations and OpenAI save properly)
* revert OpenAiOption
* add support for models per workspace for anthropic, localAi, ollama, openAi, and togetherAi
* remove unneeded comments
* update logic for when LLMProvider is reset; reset AI provider files with master
* remove frontend/api reset of workspace chat and move logic to updateENV
add postUpdate callbacks to envs
* set preferred model for chat on class instantiation (sketched below)
* remove extra param
* linting
* remove unused var
* refactor chat model selection on workspace
* linting
* add fallback for base path to localai models
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
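A sketch of the per-workspace model selection these commits add up to: the workspace's saved model wins over the environment default, resolved once at class instantiation rather than per request. The class shape, the `OPEN_MODEL_PREF` env var, and the `chatModel` field are assumptions here, not the project's exact API.

```js
// Provider picks its model once, at construction time.
class OpenAiLLM {
  constructor(embedder = null, modelPreference = null) {
    // Workspace-level preference > env default > hard fallback (all assumed names).
    this.model = modelPreference ?? process.env.OPEN_MODEL_PREF ?? "gpt-3.5-turbo";
    this.embedder = embedder;
  }
}

// Chat entry point passes the workspace's saved model (if any) through.
function getLLMProvider(workspace = null) {
  return new OpenAiLLM(null, workspace?.chatModel ?? null);
}
```

With the selection made at instantiation, resetting the LLM provider only needs to clear the workspace-level fields, which the postUpdate callbacks on env changes can do without frontend involvement.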