Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix 48 creator query too narrow #124

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

iannesbitt
Copy link

Sorry for the weekend ping. This PR attempts to broaden support for the types of creator role that can be defined in Schema.org JSON-LD metadata. I'm having trouble getting all of the pieces running on my machine, so these changes have not been tested.

Notes:

  1. I added two sparql queries, one for the creator role definition without a list (sometimes used when there is only one creator) and one for the creator role definition WITH an ordered list. I commented with examples of each, and added tests and systemmetadata for each.

  2. I did not add a query for author email as Matt Matt demoed here, as I did not see a field for author email defined in the Solr schema. If someone can suggest what Solr field to use for email and/or ORCiD, I will add those to the query.

I'm adding @artntek to the list of reviewers since I imagine he has a working setup and can test the changes.

@iannesbitt
Copy link
Author

If these queries test ok, I can also add a PR for DataONEorg/dataone-cn-index#1 with the same changes.

@iannesbitt iannesbitt linked an issue Aug 20, 2024 that may be closed by this pull request
@artntek
Copy link
Collaborator

artntek commented Oct 16, 2024

I reviewed this PR, but please note caveats:

  1. I have no experience whatsoever with sparql queries. From my research, they look OK, but if you're unsure, please seek guidance from one who knows :-)
  2. You said:

"I did not add a query for author email as Matt demoed here, as I did not see a field for author email defined in the Solr schema. If someone can suggest what Solr field to use for email and/or ORCiD, I will add those to the query.

I agree there doesn't seem to be a schema field for author email (see these entries) - maybe @taojing2002 can help us out with this question?

  1. I got test failures for both your newly-added tests

  2. Finally - if this hasn't been deployed and tested, we should do so. I'm happy to collaborate with you on getting that up and running somewhere/somehow, when you're ready

String id = schemaOrgTestCreatorRoleListPid;
indexObjectToSolr(id, schemaOrgTestCreatorRoleList);

Thread.sleep(2*SLEEPTIME);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I understand you copied this from an existing test method, but...) is this definitely needed? In the best case this test will take a minimum of 18 seconds to run (2 * SLEEPTIME + SLEEP). The whole class will take nearly 2 minutes, best case.

Suggested solution is to remove Thread.sleep(2*SLEEPTIME) altogether, and then edit your loop (on line 582) as follows:

  1. Reduce the SLEEP time in that loop to, say, 500mS.
  2. Increase the value of TIMES to make the loop duration the same as it is now (i.e. 2 * 8 + 10 * 2), given the new shorter sleep time.

That way, flow will continue as soon as the test passes, with the potential to be a lot quicker.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(FYI, this is an approach we're gradually moving to in the metacat codebase, too)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in aab0578.

String id = schemaOrgTestCreatorRolePid;
indexObjectToSolr(id, schemaOrgTestCreatorRole);

Thread.sleep(2*SLEEPTIME);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as line 579

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 3151375.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to add your "schema_org_creator_role_name" and "schema_org_creator_list_role_name" beans to the application-context-json-ld.xml file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 87881f0.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both these new tests fail for me:

+++++++++++++++++++The value of the field authorGivenName from Solr is null
The expected value of the field authorGivenName is Ian

@artntek artntek added this to the 3.2.0 milestone Nov 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sparql creator query is too narrowly defined
2 participants