Commit Graph

9 Commits

Author SHA1 Message Date
Marcello Fitton
c4f19cec0e
Refactor LLMPerformanceMonitor.measureStream() to Use Options Object Pattern (#4786)
* Refactor LLMPerformanceMonitor to use options object for measureStream parameters

* Refactor invocations of `measureStream` to use options arguments

* Change invocation of `measureStream` in anthropic provider to use options argument

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-12-16 13:10:09 -08:00
Timothy Carambat
664f466e3f
4601 log model on response (#4781)
* add model tag to chatCompletion

* add modelTag `model` to async streaming
keeps default arguments for prompt token calculation where applied via explict arg

* fix HF default arg

* render all performance metrics as available for backward compatibility
add `timestamp` to both sync/async chat methods

* extract metrics string to function
2025-12-14 14:46:55 -08:00
Timothy Carambat
c2e7ccc00f
Reimplement Cohere models for basic chat (#4489)
* Reimplement Cohere models
- Redo LLM implementation to grab models from endpoint and pre-filter
- Migrate embedding models to also grab from remote
- Add records for easy context window lookup'

* fix comment
2025-10-03 18:28:20 -07:00
Timothy Carambat
e80492606a
Automatic Context window detection (#3817)
* Add context window finder from litellm maintained list
apply to all cloud providers, have client cache for 3 days

* linting
2025-05-14 11:03:19 -07:00
Timothy Carambat
dd7c4675d3
LLM performance metric tracking (#2825)
* WIP performance metric tracking

* fix: patch UI trying to .toFixed() null metric
Anthropic tracking migraiton
cleanup logs

* Apipie implmentation, not tested

* Cleanup Anthropic notes, Add support for AzureOpenAI tracking

* bedrock token metric tracking

* Cohere support

* feat: improve default stream handler to track for provider who are actually OpenAI compliant in usage reporting
add deepseek support

* feat: Add FireworksAI tracking reporting
fix: improve handler when usage:null is reported (why?)

* Add token reporting for GenericOpenAI

* token reporting for koboldcpp + lmstudio

* lint

* support Groq token tracking

* HF token tracking

* token tracking for togetherai

* LiteLLM token tracking

* linting + Mitral token tracking support

* XAI token metric reporting

* native provider runner

* LocalAI token tracking

* Novita token tracking

* OpenRouter token tracking

* Apipie stream metrics

* textwebgenui token tracking

* perplexity token reporting

* ollama token reporting

* lint

* put back comment

* Rip out LC ollama wrapper and use official library

* patch images with new ollama lib

* improve ollama offline message

* fix image handling in ollama llm provider

* lint

* NVIDIA NIM token tracking

* update openai compatbility responses

* UI/UX show/hide metrics on click for user preference

* update bedrock client

---------

Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
2024-12-16 14:31:17 -08:00
Timothy Carambat
99f2c25b1c
Agent Context window + context window refactor. (#2126)
* Enable agent context windows to be accurate per provider:model

* Refactor model mapping to external file
Add token count to document length instead of char-count
refernce promptWindowLimit from AIProvider in central location

* remove unused imports
2024-08-15 12:13:28 -07:00
Timothy Carambat
0b845fbb1c
Deprecate .isSafe moderation (#1790)
Add type defs to helpers
2024-06-28 15:32:30 -07:00
Timothy Carambat
01cf2fed17
Make native embedder the fallback for all LLMs (#1427) 2024-05-16 17:25:05 -07:00
Sean Hatfield
3caebc47b4
[FEAT] Cohere LLM and embedder support (#1233)
* getChatCompletion working WIP streaming

* WIP

* working streaming WIP abort stream

* implement cohere embedder support

* remove inputType option from cohere embedder

* fix cohere LLM from not aborting stream when canceled by user

* Patch Cohere implemention

* add cohere to onboarding

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2024-05-02 10:35:50 -07:00