Could not find sphinx_index.conf error at end of indexing Noark 5 #18

Open
solfeggietto opened this issue Mar 14, 2022 · 3 comments

@solfeggietto

I have tried to import and index five Noark 5 extractions from the same archival system, covering different periods and parts.
They all fail at the end of indexing, reporting "Could not find sphinx_index.conf" in \rapporter\yyyy\m\dd\nnnn\vedlegg.log

This error is experienced in all released versions:

  • Piql Insight v1.0.0
  • Piql Insight v1.1.0
  • Piql Insight v1.2.0-beta3

v1.0.0 does not report this as an error, while v1.1.0 and v1.2.0-beta3 report:
"Failed to start indexer! See import log for more detail."

Final part of vedlegg.log

pdftotext -nopgbrk -enc UTF-8 F:\arkiv-work...\vedlegg\0000037731.txt: OK
Starter indeksering: index.cmd .\rapporter\2022\3\11\115000\ .\rapporter\2022\3\11\115000\vedlegg\ id2b1846ba39d0667231ee2b2770a8754
Indeksering feilet:
Could Not Find F:\arkiv-innsyn\1505_insight_v1.1.0-beta3\rapporter\2022\3\11\115000\sphinx_index.conf

indexer.log shows the error below.

  • What is the cause of this error?
  • What happens to our search and use of the index: is the index useless, or are there no implications?
  • What is the uuid/reference number below referring to (we need to know this for debugging)?

Sphinx 2.2.11-id64-release (95ae9a6)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com/)

using config file '.\rapporter\2022\3\11\104942\sphinx.conf'...
indexing index 'id2b1846ba39d0667231ee2b2770a8754'...
ERROR: index 'id2b1846ba39d0667231ee2b2770a8754': source 'src_id2b1846ba39d0667231ee2b2770a8754': XML parse error: not well-formed (invalid token) (line=3569, pos=47, docid=9).
indexer.log

total 0 docs, 0 bytes
total 0.040 sec, 0 bytes/sec, 0.00 docs/sec
total 0 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 0 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
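
The indexer error above reports "not well-formed (invalid token)" at line=3569, pos=47, which points into the XML that the indexer parses. A minimal Python sketch for inspecting that spot is given here. It assumes the XML is available as a file on disk (the names `check_xml.py` and `source.xml` are hypothetical; the thread does not say where the XML comes from) and that the cause is a control character that XML 1.0 does not allow, which is a common reason for this message but is not confirmed here. The reported pos may be counted differently from the column printed by this sketch.

```python
import re
import sys

# Characters that XML 1.0 forbids: C0 control characters other than
# tab, LF and CR, plus U+FFFE and U+FFFF.
INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def find_invalid_chars(text):
    """Return (line, column, repr) for every forbidden character in text."""
    hits = []
    # Split on "\n" only, so that form feeds and similar characters
    # stay inside their line instead of being treated as line breaks.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in INVALID_XML_CHARS.finditer(line):
            hits.append((line_no, match.start() + 1, repr(match.group())))
    return hits


def show_line(text, line_no):
    """Return the given 1-based line with non-printable characters escaped."""
    lines = text.split("\n")
    return ascii(lines[line_no - 1])


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python check_xml.py source.xml 3569
    with open(sys.argv[1], encoding="utf-8", errors="replace", newline="") as f:
        xml_text = f.read()
    for hit in find_invalid_chars(xml_text):
        print("line %d, column %d: %s" % hit)
    if len(sys.argv) > 2:
        print(show_line(xml_text, int(sys.argv[2])))
```

If this reports a hit on or near line 3569, the text extracted for that document probably contains a character that has to be removed or escaped before indexing. Undecodable bytes are replaced on reading, so they will not show up as hits in this sketch.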

@oleliabo
Collaborator

Could Not Find F:\arkiv-innsyn\1505_insight_v1.1.0-beta3\rapporter\2022\3\11\115000\sphinx_index.conf

Is it true that this file does not exist?

The sphinx_index.conf should be created by the index.cmd script. Can you try to run the command from the log

Starter indeksering: index.cmd .\rapporter\2022\3\11\115000\ .\rapporter\2022\3\11\115000\vedlegg\ id2b1846ba39d0667231ee2b2770a8754

manually and report any errors:

index.cmd .\rapporter\2022\3\11\115000\ .\rapporter\2022\3\11\115000\vedlegg\ id2b1846ba39d0667231ee2b2770a8754

If no errors are reported, remove the first line of index.cmd (@echo off) and try again.
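
To avoid editing index.cmd in place, the same step can be done on a copy. This is only a sketch: it assumes the first line is exactly `@echo off`, and the copy name `index_debug.cmd` is a hypothetical choice. If index.cmd refers to its own file name internally, the copy may behave differently from the original.

```python
from pathlib import Path


def without_echo_off(script_bytes):
    """Drop every line that is exactly '@echo off' (case-insensitive)."""
    kept = [
        line
        for line in script_bytes.splitlines(keepends=True)
        if line.strip().lower() != b"@echo off"
    ]
    return b"".join(kept)


def write_debug_copy(script_path, debug_name="index_debug.cmd"):
    """Write an echo-enabled copy next to the script and return its path."""
    source = Path(script_path)
    target = source.with_name(debug_name)
    # Working on bytes keeps the original line endings and code page intact.
    target.write_bytes(without_echo_off(source.read_bytes()))
    return target
```

Calling `write_debug_copy("index.cmd")` and then running `index_debug.cmd` with the same three arguments as in the log should echo each command as it runs, so the step that fails to create sphinx_index.conf becomes visible.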

@solfeggietto
Author

solfeggietto commented Mar 15, 2022

Could Not Find F:\arkiv-innsyn\1505_insight_v1.1.0-beta3\rapporter\2022\3\11\115000\sphinx_index.conf

Is it true that this file does not exist?

=> Yes, the file sphinx_index.conf is not created.
=> I have tested a tiny sample Noark 5 extraction with only 1 document, and in that case sphinx_index.conf is created on the same computer.

Running index.cmd manually gives the same error "could not find ....\sphinx_index.conf".

  • Does all this mean the file existed at the start of indexing but mysteriously disappeared?

@oleliabo
Collaborator

Could you try to remove the first line in index.cmd (@echo off) and try to run manually again?

That should reveal some more errors.
