15 Commits

Author SHA1 Message Date
Marcello Fitton
6855bbf695
Refactor Class Name Logging (#4426)
* Add className property to various LLM and embedder classes to fix logging bug after minification

* Fix bug with this.log method by applying the missing private field symbol
2025-09-25 15:34:19 -10:00
Timothy Carambat
2c19dd09ed
Native Embedder model selection (incl: Multilingual support) (#3835)
* WIP on embedder selection
TODO: apply splitting and query prefixes (if applicable)

* wip on upsert

* Support base model
support nomic-text-embed-v1
support multilingual-e5-small
Add prefixing for both embedding and query for RAG tasks
Add chunking prefix to all vector dbs to apply prefix when possible
Show dropdown and auto-pull on new selection

* norm translations

* move supported models to constants
handle null selection or invalid selection on dropdown
update comments

* dev

* patch text splitter maximums for now

* normalize translations

* add tests for splitter functionality

* normalize

---------

Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
2025-07-22 10:07:20 -07:00
Sean Hatfield
1cd0cc32b8
Fix chunking/snippet logs for clarity (#4129)
update chunking/snippet logs for clarity
2025-07-11 10:54:54 -07:00
timothycarambat
c2c4f63643 bump cdn 2025-02-05 10:30:43 -08:00
timothycarambat
e192364d8d Migrate CDN download URL from S3 bucket 2025-01-07 12:09:14 -08:00
Timothy Carambat
244ce2e307
Prevent concurrent downloads on first-doc upload (#1267) 2024-05-02 10:15:11 -07:00
Timothy Carambat
bf435b2861
Adjust how text is split depending on input type (#1238)
resolves #1230
2024-04-30 10:11:56 -07:00
Timothy Carambat
6f52a2b729
Embedder download - fallback URL (#1056)
* Embedder download - fallback URL

* improve logging for native embedder
2024-04-06 11:49:15 -07:00
Timothy Carambat
d0a3f1e3e1
Fix present dimensions on vectorDBs to be inferred for providers who require it (#605) 2024-01-16 13:41:01 -08:00
Timothy Carambat
4f6d93159f
improve native embedder handling of large files (#584)
* improve native embedder handling of large files

* perf changes

* ignore storage tmp
2024-01-13 00:32:43 -08:00
Shuyoou
6faa0efaa8
Issue #543 support milvus vector db (#579)
* issue #543 support milvus vector db

* migrate Milvus to use MilvusClient instead of ORM
normalize env setup for docs/implementation
feat: embedder model dimension added

* update comments

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2024-01-12 13:23:57 -08:00
timothycarambat
049bfa14cb fix: fully separate chunkconcurrency from chunk length 2023-12-20 11:20:40 -08:00
timothycarambat
a7f6003277 fix: set lower maxChunk limit on native embedder to stay within resource constraints
chore: update comment for what embedding chunk means
2023-12-19 16:20:34 -08:00
Timothy Carambat
8cc1455b72
feat: add support for variable chunk length (#415)
fix: cleanup code for embedding length clarity
resolves #388
2023-12-07 16:27:36 -08:00
Timothy Carambat
88cdd8c872
Add built-in embedding engine into AnythingLLM (#411)
* Implement use of native embedder (all-MiniLM-L6-v2)
stop showing prisma queries during dev

* Add native embedder as an available embedder selection

* wrap model loader in try/catch

* print progress on download

* Update to progress output for embedder

* move embedder selection options to component

* forgot import

* add Data privacy alert updates for local embedder
2023-12-06 10:36:22 -08:00