Allow browsing and discovery of knowledge base in the search page #1073
- One current issue in the Khoj application is that managing the files referenced as the user's knowledge base is slightly opaque and difficult to access
- Add a migration for associating the file objects directly with the Entry objects, making it easier to get data via foreign key
- Add the new page that shows all indexed files in the search view, also allowing you to upload new docs directly from that page
- Support new APIs for getting / deleting files
- Remove knowledge page from the sidebar
- Improve speed and rendering of the documents in the search page
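The back-fill that the migration performs can be sketched roughly as follows. This is a minimal, framework-free illustration, not the PR's actual code: the `FileObject` and `Entry` dataclasses, the `file_name` field, the `file_object_id` attribute, and the `link_entries_to_file_objects` helper are all assumptions made for the example. Only the `(entry.user_id, entry.file_path)` lookup key and the `entries_to_update` / `file_objects_map` names come from the diff under review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileObject:
    # Hypothetical stand-in for the stored file record.
    id: int
    user_id: int
    file_name: str


@dataclass
class Entry:
    # Hypothetical stand-in for an indexed entry.
    id: int
    user_id: int
    file_path: str
    file_object_id: Optional[int] = None  # plays the role of the new foreign key


def link_entries_to_file_objects(entries, file_objects):
    """Point each entry at the file object with the same user and path.

    Returns only the entries that were changed, so a caller could
    persist them in a single bulk update.
    """
    # Index file objects once so each entry lookup is a dict access.
    file_objects_map = {(f.user_id, f.file_name): f for f in file_objects}

    entries_to_update = []
    for entry in entries:
        file_object = file_objects_map.get((entry.user_id, entry.file_path))
        if file_object is None:
            # No matching file object: leave the entry unlinked.
            continue
        if entry.file_object_id != file_object.id:
            entry.file_object_id = file_object.id
            entries_to_update.append(entry)
    return entries_to_update
```

Keying the map on both user and path keeps two users with identically named files from being linked to each other's file objects, and returning only the changed entries makes a second run a no-op.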
…knowledge-base-page
…loading the Api or the client
…knowledge-base-page
Left some comments but otherwise the changes look good!
entries_to_update = []
for entry in entries:
    try:
        file_object = file_objects_map.get((entry.user_id, entry.file_path))
Should we also delete orphaned file objects as part of this migration?
I'm inclined to say no. It's better to avoid complex data migrations taking place alongside database migrations to reduce any risk of long jobs. I'm on the fence even about running migrate_entry_objects here, except that self-hosted users would get left out of the new setup if it didn't run automatically, which could break server-side expectations.
The consequence of not deleting orphaned file objects automatically is that self-hosted users might see those dangling files in their /search page. I think that's an acceptable tradeoff to avoid auto-running large data deletion operations live.
Here are some alternatives:
- We could add the management command instructions for a Docker-friendly environment in the release notes.
- Alternatively, we could reuse the older migration logic we'd built in cli.py, from when everything was based on config.yml, but it would have to be upgraded to align with how things work now.
Let's continue the discussion after merge. We can add/modify as needed before release.
…knowledge-base-page
Currently, it's rather opaque and difficult to substantially browse through the uploaded knowledge base. Effectively, you can only do this through the small file modal in the settings page.
Update to include all indexed files in the search page for viewing & deletion. Function to delete all files is still in the settings page.
Add a migration that associates file objects with entries using a foreign key. Add a migration command that deletes dangling file objects.
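As a rough illustration of what "dangling" means here, the sketch below lists file objects that no entry refers to. It is an assumption-laden example rather than the command added in this PR: `FileRecord`, `EntryRecord`, and `find_dangling_file_objects` are hypothetical names, and matching on `(user_id, file path)` is assumed from the lookup key used in the migration.

```python
from typing import List, NamedTuple


class FileRecord(NamedTuple):
    # Hypothetical stand-in for a stored file object.
    id: int
    user_id: int
    file_name: str


class EntryRecord(NamedTuple):
    # Hypothetical stand-in for an indexed entry.
    user_id: int
    file_path: str


def find_dangling_file_objects(
    file_objects: List[FileRecord], entries: List[EntryRecord]
) -> List[FileRecord]:
    """Return file objects that no entry refers to.

    The function only reports candidates; deletion is left to the
    caller so it can be reviewed or run as a separate, explicit step.
    """
    # Collect every (user, path) pair that at least one entry still uses.
    referenced = {(e.user_id, e.file_path) for e in entries}
    return [f for f in file_objects if (f.user_id, f.file_name) not in referenced]
```

Separating detection from deletion fits the preference voiced in the review thread for not running large deletions automatically: the list can be inspected first and removed in a deliberate second step.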