Commit

Avoid double scan
Weves committed Jan 15, 2025
1 parent 9019a64 commit 569794c
Showing 1 changed file with 2 additions and 5 deletions.
7 changes: 2 additions & 5 deletions backend/onyx/background/indexing/run_indexing.py
@@ -123,11 +123,8 @@ def strip_null_characters(doc_batch: list[Document]) -> list[Document]:
                 )
                 section.link = section.link.replace("\x00", "")

-            if section.text and "\x00" in section.text:
-                logger.warning(
-                    f"NUL characters found in document text for document: {cleaned_doc.id}"
-                )
-                section.text = section.text.replace("\x00", "")
+            # since text can be longer, just replace to avoid double scan
+            section.text = section.text.replace("\x00", "")

         cleaned_batch.append(cleaned_doc)
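
The change above can be illustrated with a minimal standalone sketch. The helper names `strip_nul_checked` and `strip_nul_direct` are hypothetical and are not part of the repository; they only mirror the before and after shapes of the text-handling lines. The sketch assumes the text is always a `str`; if it could be `None`, the unguarded call would raise `AttributeError`.

```python
def strip_nul_checked(text: str) -> str:
    # Pre-commit shape (hypothetical helper): a membership test scans the
    # string, then replace() scans it again whenever a NUL is present.
    if text and "\x00" in text:
        return text.replace("\x00", "")
    return text


def strip_nul_direct(text: str) -> str:
    # Post-commit shape (hypothetical helper): replace() is called
    # unconditionally, so the string is walked once. Assumes text is a str.
    return text.replace("\x00", "")


samples = ["", "clean text", "a\x00b", "\x00\x00"]
results = [(strip_nul_checked(s), strip_nul_direct(s)) for s in samples]
```

Both helpers return equal strings for every sample, so the membership test only ever gated the warning. The trade-off visible in the diff is that the warning about NUL characters in document text is no longer logged.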
