Commit Graph

18 Commits

Author SHA1 Message Date
Marcello Fitton
8f33203ade
chore: add ESLint to /collector (#5128)
* add eslint config to /collector

* prettier formatting

* fix unused

* fix undefined

* disable lines

* lockfile

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2026-03-05 16:25:23 -08:00
Neha Prasad
3ecf218eea
feat: Add SSL certificate bypass support for self-hosted Confluence instances (#4219)
* Added bypassSSL parameter to constructor and implemented SSL bypass logic in fetchConfluenceData method

* Updated generateChunkSource function to include bypassSSL in the encrypted payload

* Updated the request body to include bypassSSL in the JSON payload sent to the backend

* Updated form submission to include bypassSSL parameter from the checkbox

* Added bypass_ssl: "Bypass SSL Certificate Validation" translation

* passed these parameters to fetchconfluencepage function for proper resync functionality

* allow ignore of SSL cert for Confluence

* add translations

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-11-25 14:32:10 -08:00
Sean Hatfield
05df4ac72b
Paperless ngx data connector (#4121)
* paperless ngx data connector

* wip resync paperless ngx

* fix generateChunkSource for resyncing paperless ngx

* lint

* Refactor Paperless-NGX connector
Fix issue with date rendering in tooltip + extended width
Move tooltip details to be column for more space

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2025-11-20 11:27:38 -08:00
timothycarambat
5173c75113 rescope validatedLink to local var 2025-10-07 12:08:53 -07:00
Timothy Carambat
cf3fbcbf0f
Improve URL handler for collector processes (#4504)
* Improve URL handler for collector processes

* dev build
2025-10-07 11:03:27 -07:00
Sean Hatfield
2b274c62b7
Obsidian data connector (#3798)
* add obsidian vault data connector

* lint

* add english translations

* normalize translations

* improve file parser and reader

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-05-12 13:45:27 -07:00
Timothy Carambat
fd4929b4d2
Feature/drupalwiki collector (#3693)
* Implement DrupalWiki collector

* Add attachment downloading and processing functionality (#3)

* linting

* Linting
Add citation image
small refactors
add URL for citation identifier

---------

Co-authored-by: em <eugen.mayer@kontextwork.de>
Co-authored-by: rexjohannes <53578137+rexjohannes@users.noreply.github.com>
Co-authored-by: Eugen Mayer <136934+EugenMayer@users.noreply.github.com>
2025-04-21 09:17:24 -07:00
Adam Setch
d63438fa61
chore: rename Github to GitHub (#3199)
* chore: rename Github to GitHub

Signed-off-by: Adam Setch <adam.setch@outlook.com>

* chore: rename Github to GitHub

Signed-off-by: Adam Setch <adam.setch@outlook.com>

* Undo some code changes for references

---------

Signed-off-by: Adam Setch <adam.setch@outlook.com>
Co-authored-by: timothycarambat <rambat1010@gmail.com>
2025-02-13 10:45:43 -08:00
Sean Hatfield
0bb47619dc
Allow 127.0.0.1 as valid URL for scraping (#2560)
* allow 127.0.0.1 as valid url for scraping

* update comments and lint

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2024-10-31 09:57:28 -07:00
Sean Hatfield
4488744850
Support more Confluence URL formats (#2118)
* support more confluence url formats

* use pattern matching for confluence urls and manual splitting as fallback

* rework entire Confluence flow to prevent issues with custom, local, and cloud spaces

* remove dep

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2024-09-25 16:12:17 -07:00
timothycarambat
5f477e0dbd remove log 2024-09-06 11:37:46 -07:00
Timothy Carambat
42235fcd8a
GitLab Hosted and Local Connector (#1932)
* Add support for GitLab repo collection as well as Github Repo collection
* Refactor for repo collectors to be more compact

---------

Co-authored-by: Emil Rofors <emirof@gmail.com>
2024-07-23 12:23:51 -07:00
Timothy Carambat
dc4ad6b5a9
[BETA] Live document sync (#1719)
* wip bg workers for live document sync

* Add ability to re-embed specific documents across many workspaces via background queue
bgworkser is gated behind expieremental system setting flag that needs to be explictly enabled
UI for watching/unwatching docments that are embedded.
TODO: UI to easily manage all bg tasks and see run results
TODO: UI to enable this feature and background endpoints to manage it

* create frontend views and paths
Move elements to correct experimental scope

* update migration to delete runs on removal of watched document

* Add watch support to YouTube transcripts (#1716)

* Add watch support to YouTube transcripts
refactor how sync is done for supported types

* Watch specific files in Confluence space (#1718)

Add failure-prune check for runs

* create tmp workflow modifications for beta image

* create tmp workflow modifications for beta image

* create tmp workflow modifications for beta image

* dual build
update copy of alert modals

* update job interval

* Add support for live-sync of Github files

* update copy for document sync feature

* hide Experimental features from UI

* update docs links

* [FEAT] Implement new settings menu for experimental features (#1735)

* implement new settings menu for experimental features

* remove unused context save bar

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>

* dont run job on boot

* unset workflow changes

* Add persistent encryption service
Relay key to collector so persistent encryption can be used
Encrypt any private data in chunkSources used for replay during resync jobs

* update jsDOC

* Linting and organization

* update modal copy for feature

---------

Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 13:38:50 -07:00
Sean Hatfield
612a7e1662
[FEAT] Website depth scraping data connector (#1191)
* WIP website depth scraping, (sort of works)

* website depth data connector stable + add maxLinks option

* linting + loading small ui tweak

* refactor website depth data connector for stability, speed, & readability

* patch: remove console log
Guard clause on URL validitiy check
reasonable overrides

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2024-05-14 12:49:14 -07:00
Sean Hatfield
348b36bf85
[FEAT] Confluence data connector (#1181)
* WIP Confluence data connector backend

* confluence data connector complete

* confluence citations

* fix citation for confluence

* Patch confulence integration

* fix Citation Icon for confluence

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2024-04-25 17:53:38 -07:00
Timothy Carambat
f4088d9348
RSA-Signing on server<->collector communication via API (#1005)
* WIP integrity check between processes

* Implement integrity checking on document processor payloads
2024-04-01 13:56:35 -07:00
Timothy Carambat
ecf4295537
Add ability to grab youtube transcripts via doc processor (#470)
* Add ability to grab youtube transcripts via doc processor

* dynamic imports
swap out Github for Youtube in placeholder text
2023-12-18 17:17:26 -08:00
Timothy Carambat
452582489e
GitHub loader extension + extension support v1 (#469)
* feat: implement github repo loading
fix: purge of folders
fix: rendering of sub-files

* noshow delete on custom-documents

* Add API key support because of rate limits

* WIP for frontend of data connectors

* wip

* Add frontend form for GitHub repo data connector

* remove console.logs
block custom-documents from being deleted

* remove _meta unused arg

* Add support for ignore pathing in request
Ignore path input via tagging

* Update hint
2023-12-18 15:48:02 -08:00