Create Collection takes 30-45 seconds & sometimes times out #2914

Closed
tdonohue opened this issue Apr 9, 2024 · 8 comments · Fixed by DSpace/DSpace#9462
Labels
bug component: Collection Collection display or editing high priority testathon Reported by a tester during Community Testathon

@tdonohue
Member

tdonohue commented Apr 9, 2024

Describe the bug
This issue is reproducible on both https://sandbox.dspace.org (running pre-8.0) and https://demo.dspace.org (running 7.x). When attempting to create a new Collection, the request always takes a long time (at least 30-45 seconds) and sometimes times out with a 504 Gateway Timeout error.

This issue only occurs when creating Collections. Creating Communities or Items seems unaffected.

To Reproduce
Steps to reproduce the behavior:

  1. Login as an Admin to either Sandbox or Demo
  2. In your browser, open up your DevTools -> Network tab (useful to watch the create step)
  3. In the Admin menu, choose "New -> Collection". Select any Community as the parent
  4. Enter a title and click "Save". The page will appear to not respond. Eventually either a 504 or 200 will return, but only after about 30-45 seconds.
    • In your DevTools -> Network tab you'll see that a POST request was made to /server/api/core/collections?parent=[community-uuid]. It will remain "Pending" for the entire 30-45 seconds.
    • On Sandbox, the request will sometimes return as successful (200 OK), but only after about 30-45 seconds.
    • On Demo, the request almost always times out (504 error returned). However, the Collection will have been created successfully behind the scenes. So, if you search for the collection after seeing a 504 error, you will find that it exists.
    • In the backend dspace.log file, the POST request appears to succeed almost immediately. It's unclear why it stays "Pending" for so long, and whether the bug is in the frontend or backend.

Expected behavior
Creating a Collection should be as fast as creating a Community.

Related work
It's unclear what caused this bug, but the bug exists in both dspace-7_x and main. So, that implies it was caused by a bug fix which was applied to both branches.

@tdonohue tdonohue added bug high priority component: Collection Collection display or editing testathon Reported by a tester during Community Testathon labels Apr 9, 2024
@github-project-automation github-project-automation bot moved this to 📋 To Do in DSpace 8.0 Release Apr 9, 2024
@toniprieto
Contributor

Hi @tdonohue, reading the description of the issue, I think it could be related to a consumer configured to reload the submission configurations when a collection is edited/created. See:

https://github.com/DSpace/DSpace/blob/main/dspace/config/dspace.cfg#L833-L835

I think this consumer could be refactored to avoid reloading the configurations multiple times during the creation of a collection, by moving the reload process to the end() function.

I am not able to reproduce it locally, but I suppose that it happens because in demo and sandbox reloading the submission configurations takes more time.

I will send a PR with the proposed changes to see if it could solve or mitigate the issue.
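
The idea in the comment above can be sketched roughly as follows. This is only an illustration of the "flag in consume(), reload once in end()" pattern, not the code from the actual PR: `ReloadCounter`, `DeferredReloadConsumer`, and the string-based `consume` signature are hypothetical stand-ins, and the real DSpace consumer interface takes different arguments.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the expensive submission-config reload (not the DSpace API).
class ReloadCounter {
    final AtomicInteger reloads = new AtomicInteger();

    void reload() {
        reloads.incrementAndGet();
    }
}

// Sketch of a consumer that only records that a reload is needed while events
// arrive, and performs a single reload when end() is called.
class DeferredReloadConsumer {
    private final ReloadCounter config;
    private boolean reloadNeeded = false;

    DeferredReloadConsumer(ReloadCounter config) {
        this.config = config;
    }

    // Called once per event; cheap, because it only sets a flag.
    void consume(String objectType) {
        if ("Collection".equals(objectType)) {
            reloadNeeded = true;
        }
    }

    // Called once after all events have been consumed; reloads at most once.
    void end() {
        if (reloadNeeded) {
            config.reload();
            reloadNeeded = false;
        }
    }
}
```

With this shape, creating one collection may fire several events, but the reload runs only once, after the last of them.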

@tdonohue
Member Author

tdonohue commented Apr 9, 2024

@toniprieto : That would be wonderful! I had actually just realized myself that the likely cause is the SubmissionConfigConsumer, but I had not determined how to fix it. If you could create a quick PR to help mitigate the issues, that would be very much appreciated! I'd ensure it gets reviewed/tested quickly.

@tdonohue
Member Author

NOTE: After applying the fix in #9462, the "Create Collection" step is much faster... though it's still slightly slower than I'd like to see. It now takes ~6 seconds instead of ~30-45 seconds.

We may need to consider ways to speed this up further...or possibly run the submission config "reload" behind the scenes (instead of waiting on it to complete before returning). But, for now, the significant performance issue is fixed.
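
One way the "behind the scenes" option could look is sketched below, purely as an assumption about the general shape and not as anything DSpace currently does. `BackgroundReload`, `createCollection`, and `reloadSubmissionConfig` are hypothetical names, and the 200 ms sleep simply simulates a slow reload.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: hand the slow reload to a background thread so the caller
// does not wait for it before the request returns.
class BackgroundReload {
    final AtomicInteger reloads = new AtomicInteger();

    // Hypothetical slow reload; the sleep stands in for the real work.
    void reloadSubmissionConfig() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        reloads.incrementAndGet();
    }

    // Returns right after scheduling the reload; the returned future
    // completes once the reload has finished.
    CompletableFuture<Void> createCollection(String name) {
        // ... the collection itself would be persisted here ...
        return CompletableFuture.runAsync(this::reloadSubmissionConfig);
    }
}
```

The trade-off is that, for a short window after the response is sent, the in-memory submission configuration is stale with respect to the new collection.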

@tdonohue tdonohue added this to the 8.0 milestone Apr 10, 2024
@tdonohue tdonohue moved this from 📋 To Do to ✅ Done in DSpace 8.x and 7.6.x Maintenance Apr 10, 2024
@tdonohue tdonohue modified the milestones: 8.0, 7.6.2 Apr 10, 2024
@paulo-graca
Contributor

There is an issue, DSpace/DSpace#9402, that is addressed by DSpace/DSpace#9415. That PR fixes a problem that limits you to a maximum of 10 collections per entity type (the default Solr rows limit). However, it has not been merged into the DSpace source yet.

By choosing Solr to select a form based on the entity types defined on collections, we intended to speed up results. Even with DSpace/DSpace#9415, 30-45 seconds seems like too much time, considering we are using Solr. Perhaps you have a big repository (with a large number of collections) and a large number of different entity types; that could explain it.

Also, does the repository change this configuration?

event.consumer.submissionconfig.filters = Collection+Modify_Metadata

Currently, with our local migrated repositories, we have not observed this issue. With our local repositories it takes about 15 sec to create a new collection and 5 sec to effectively edit it. Perhaps the issue could also be with policy creation.

@paulo-graca
Contributor

A better implementation than the one I did for forms loading, though it would take some effort, would be to reconsider how submission forms are made available. Currently, this is based on the configured collections, specifically on the collection's Handle, but perhaps it makes sense to do a runtime validation based on the entity type of the provided collection.

@tdonohue
Member Author

tdonohue commented Apr 11, 2024

@paulo-graca : I can verify the behavior here was on sandbox.dspace.org and demo.dspace.org. Those sites both use default settings in dspace.cfg with regards to the event.consumer.submissionconfig.filters.

I've not yet narrowed down why this submissionconfig consumer is adding so much time to the Collection creation process, but it is down to ~6 seconds now (after merging DSpace/DSpace#9462). That's a big improvement over 30+ seconds, but I'd still like to see if we can make it faster (as Communities are created much more quickly, for example).

I'll take a closer look at DSpace/DSpace#9415 as well to see if there are improvements there...but at a glance, I'm not sure whether that will impact performance. It does seem like an important bug fix though which we should get into 8.0 and 7.x

@toniprieto
Contributor

@paulo-graca I think it's a good idea to implement what you suggested in your last comment. In version 8.x, a change that allows configuring forms at the community level has been included (DSpace/DSpace#9259). With this change, the function that returns the Item Submission process used by a collection receives a collection as a parameter (previously it received a handle). See:

https://github.com/DSpace/DSpace/blob/main/dspace-api/src/main/java/org/dspace/app/util/SubmissionConfigReader.java#L228

Having the collection makes it simpler to retrieve the entity type and look it up in a map, built during the initial load, that relates entity types to item submission processes. This way, there would be no need to reload the submission config when a collection is edited or created, and the issue DSpace/DSpace#9402 should also be resolved.
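
A minimal sketch of that lookup, under stated assumptions: `SubmissionLookup`, its constructor, and `submissionNameFor` are hypothetical names, and the entity types and submission names used when calling it are illustrative values, not necessarily what the PR or DSpace configuration uses.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a map from entity type to submission process name, built once at
// initial load and consulted at request time, so no reload is needed when a
// collection is created or edited.
class SubmissionLookup {
    private final Map<String, String> byEntityType;
    private final String defaultSubmission;

    SubmissionLookup(Map<String, String> byEntityType, String defaultSubmission) {
        // Defensive copy so later changes to the caller's map do not leak in.
        this.byEntityType = new HashMap<>(byEntityType);
        this.defaultSubmission = defaultSubmission;
    }

    // Resolves the submission name for a collection's entity type, falling
    // back to the default when the type is missing or unmapped.
    String submissionNameFor(String entityType) {
        if (entityType == null) {
            return defaultSubmission;
        }
        return byEntityType.getOrDefault(entityType, defaultSubmission);
    }
}
```

For example, a lookup built from `Map.of("Publication", "publication")` with default `"traditional"` would resolve a Publication collection to `"publication"` and any other collection to `"traditional"`.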

@toniprieto
Contributor

@paulo-graca @tdonohue I've had time these days to implement the approach described in my last comment, and I've sent a PR: DSpace/DSpace#9478. The implementation is very similar to DSpace/DSpace#9259.
Could you take a look to see if this new approach is appropriate?
