Fix CI #91

Draft: mweidling wants to merge 1 commit into base: master

@mweidling mweidling commented Apr 26, 2022

The mets.xml of the grenzboten-test directory uses a prefix in fileGrp/@USE that has no corresponding directory. This PR fixes the typo.
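A mismatch like this can be spotted with a small check. The following is a minimal sketch, assuming that each fileGrp is expected to have a directory of the same name next to mets.xml; the function name `filegrps_without_dir` is hypothetical and not part of any OCR-D tool.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# METS namespace used by mets.xml files
METS_NS = {"mets": "http://www.loc.gov/METS/"}


def filegrps_without_dir(mets_path):
    """Return fileGrp/@USE values with no same-named directory beside mets.xml.

    Assumption: the workspace stores each fileGrp in a directory named
    after its USE attribute. This helper is illustrative only.
    """
    mets_path = Path(mets_path)
    root = ET.parse(mets_path).getroot()
    uses = [grp.get("USE") for grp in root.iterfind(".//mets:fileGrp", METS_NS)]
    return [use for use in uses if use and not (mets_path.parent / use).is_dir()]
```

Calling `filegrps_without_dir("mets.xml")` from within a test directory would list the USE values that need a closer look.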

@mweidling mweidling requested a review from kba April 26, 2022 05:55
@mweidling mweidling changed the title from "fix: typo" to "WIP: fix: typo" Apr 26, 2022

mweidling commented Apr 26, 2022

After having altered the OCR-D- prefix I get the following error for the gutachten dir:

<report valid="false">
  <error>PAGE_TEMP1: Line 123: Element '{http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15}UnorderedGroupIndexed': Missing child element(s). Expected is one of ( {http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15}UserDefined, {http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15}Labels, {http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15}RegionRef, {http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15}OrderedGroup, {http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15}UnorderedGroup ).</error>
  <notice>fileGrp USE does not begin with 'OCR-D-': IMG</notice>
  <notice>fileGrp USE does not begin with 'OCR-D-': TEMP1</notice>
  <notice>fileGrp USE does not begin with 'OCR-D-': TEMP2</notice>
</report>

@mweidling mweidling marked this pull request as draft April 26, 2022 06:15
@mweidling mweidling changed the title from "WIP: fix: typo" to "Fix CI" Apr 26, 2022

mweidling commented

The reason for the error mentioned above seems to be https://github.com/OCR-D/assets/blob/master/data/gutachten/data/TEMP1/PAGE_TEMP1.xml#L123. Removing it results in a successful report.

@kba Is this real data that shouldn't be altered, or can we just add the missing child element?
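To locate elements like the one reported, a check along these lines may help. It is a minimal sketch using only the Python standard library: it finds group elements in the PAGE 2019-07-15 namespace that have no child elements at all, which is the situation the validator complains about. It is not a full schema validation, and the function name `empty_groups` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the validation error message above
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"


def empty_groups(page_xml_text, tag="UnorderedGroupIndexed"):
    """Return the ids of all <tag> elements that have no child elements.

    Illustrative helper only: it detects completely empty groups and
    does not check the rest of the PAGE schema.
    """
    root = ET.fromstring(page_xml_text)
    return [el.get("id") for el in root.iter(f"{{{PAGE_NS}}}{tag}") if len(el) == 0]
```

Passing the text of a PAGE file to `empty_groups` returns the ids of the empty groups, if any.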


@kba kba left a comment


> The reason for the error mentioned above seems to be https://github.com/OCR-D/assets/blob/master/data/gutachten/data/TEMP1/PAGE_TEMP1.xml#L123. Removing it results in a successful report.
>
> @kba Is this real data that shouldn't be altered, or can we just add the missing child element?

No, we actually need that in the test_empty_groups_to_regionrefindexed test in tests/model/test_ocrd_page.py. We might want to exclude the gutachten folder from validation in the Makefile, i.e. by adding -not -name "gutachten" to https://github.com/OCR-D/assets/blob/master/Makefile#L51
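The effect of such a filter can be illustrated in Python. This is only a sketch of the idea of excluding a directory by name; the actual `find` invocation in the Makefile is not shown here, and `dirs_to_validate` is a hypothetical helper. Unlike `find -name`, it matches names exactly and does not support glob patterns.

```python
from pathlib import Path


def dirs_to_validate(data_root, excluded=("gutachten",)):
    """List subdirectory names of data_root, skipping the excluded names.

    Roughly mirrors what `-not -name "gutachten"` would do for a
    directory listing; illustrative only.
    """
    return sorted(
        p.name
        for p in Path(data_root).iterdir()
        if p.is_dir() and p.name not in excluded
    )
```

With the default arguments, a directory named gutachten is left out of the list while all other subdirectories are kept.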

But the OCRD vs OCR-D typo is spot on, can be merged.

Unfortunately, I haven't updated the assets in core in a while and there are some issues now unrelated to this PR.

2 participants