Handle frontend loaded zip archive content (h5p, bloomd) that contains large files #12805

Draft: wants to merge 5 commits into base: develop

@rtibbles rtibbles commented Nov 7, 2024

Summary

  • Adds range request support to the zipcontent endpoint, so that we can serve audio and video files from inside archive files
  • Allows the zipcontent endpoint once again to serve all recognized archive formats, so we can defer to the zipcontent endpoint for large files
  • Updates frontend archive handling to fetch zip directory entries first, and then load zip data in segments, excluding large files
  • Updates h5p handling to defer to the zipcontent backend for large files (> 500KB)
  • Will update this for Bloom handling once #12752 ("Fix various bugs in the Bloom Player implementation") is merged
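
As a rough illustration of the first bullet, here is a minimal Python sketch of how a single-range `Range: bytes=...` header could be resolved against a file stored inside a zip archive. It is not the code from this PR: the function names `parse_byte_range` and `read_member_range` are hypothetical, only single ranges are handled, and an unusable range simply falls back to a full response where a real endpoint would likely answer 416.

```python
import io
import zipfile


def parse_byte_range(header, size):
    """Return an inclusive (start, end) pair for a single 'bytes=' range, or None."""
    if not header or not header.startswith("bytes="):
        return None
    spec = header[len("bytes="):].strip()
    if "," in spec:
        # Multiple ranges are out of scope for this sketch.
        return None
    first, sep, last = spec.partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form, e.g. "bytes=-500" means the last 500 bytes.
            length = int(last)
            if length <= 0:
                return None
            start, end = max(size - length, 0), size - 1
        else:
            start = int(first)
            end = int(last) if last else size - 1
    except ValueError:
        return None
    end = min(end, size - 1)
    if start > end:
        return None
    return start, end


def read_member_range(archive, name, range_header):
    """Hypothetical helper: return (status, headers, body) for one archive member."""
    info = archive.getinfo(name)
    byte_range = parse_byte_range(range_header, info.file_size)
    with archive.open(info) as member:
        if byte_range is None:
            # Assumption: ignore unusable ranges and send the whole file.
            body = member.read()
            return 200, {"Content-Length": str(len(body))}, body
        start, end = byte_range
        # ZipExtFile.seek works on uncompressed offsets (Python 3.7+).
        member.seek(start)
        body = member.read(end - start + 1)
    headers = {
        "Accept-Ranges": "bytes",
        "Content-Range": f"bytes {start}-{end}/{info.file_size}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body
```

For compressed members, seeking still has to decompress everything before the requested offset, so range support mainly saves transfer size rather than server-side work.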

References

Fixes #9761

Reviewer guidance

Does this work for H5P files that contain large media assets like videos and audio?

Questions: this covers the previously encountered case where H5P files contained extremely large individual files, but not the case where the archive itself is enormous yet contains only lots of tiny files. Have we encountered any archives where that is the issue? Is there any benefit to limiting the size of segments and lazily loading them on demand? These may be empirical questions best answered by looking at lots of H5P files.
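
To make the segment-size question concrete, here is a hedged Python sketch of one way to plan segments from a zip's directory entries. Files above a threshold are deferred (for example, to the zipcontent endpoint) and the remaining small files are grouped into size-capped runs. The actual frontend code in this PR is JavaScript and may work differently; `plan_segments`, `MAX_SEGMENT_BYTES`, and reading "500KB" as 500 * 1024 bytes are all assumptions made for illustration.

```python
import io
import zipfile

# Threshold from the PR description (> 500KB); the exact byte value is an assumption.
LARGE_FILE_THRESHOLD = 500 * 1024
# Hypothetical cap on the compressed bytes fetched per segment.
MAX_SEGMENT_BYTES = 2 * 1024 * 1024


def plan_segments(archive, large_threshold=LARGE_FILE_THRESHOLD,
                  max_segment=MAX_SEGMENT_BYTES):
    """Return (segments, deferred): lists of member names per segment,
    plus the names of large files to fetch individually."""
    segments, deferred = [], []
    current, current_bytes = [], 0

    def flush():
        nonlocal current, current_bytes
        if current:
            segments.append(current)
        current, current_bytes = [], 0

    # Walk entries in on-disk order so each segment is a contiguous run.
    for info in sorted(archive.infolist(), key=lambda i: i.header_offset):
        if info.is_dir():
            continue
        if info.file_size > large_threshold:
            # A large file interrupts the contiguous run of small files.
            flush()
            deferred.append(info.filename)
            continue
        if current and current_bytes + info.compress_size > max_segment:
            flush()
        current.append(info.filename)
        current_bytes += info.compress_size
    flush()
    return segments, deferred
```

With a cap like this, an enormous archive of tiny files would be split into many segments that could be fetched on demand, which is the trade-off the question above asks about.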


Testing checklist

  • Contributor has fully tested the PR manually
  • If there are any front-end changes, before/after screenshots are included
  • Critical user journeys are covered by Gherkin stories
  • Critical and brittle code paths are covered by unit tests

PR process

  • PR has the correct target branch and milestone
  • PR has 'needs review' or 'work-in-progress' label
  • If PR is ready for review, a reviewer has been added. (Don't use 'Assignees')
  • If this is an important user-facing change, PR or related issue has a 'changelog' label
  • If this includes an internal dependency change, a link to the diff is provided

Reviewer checklist

  • PR is fully functional
  • PR has been tested for accessibility regressions
  • External dependency files were updated if necessary (yarn and pip)
  • Documentation is updated
  • Contributor is in AUTHORS.md

@github-actions github-actions bot added the DEV: backend Python, databases, networking, filesystem... label Nov 7, 2024

pcenov commented Nov 14, 2024

Hi @rtibbles,

I noticed the following issues:

  1. A lot of resources are not loading at all. Examples are 'Advent Calendar (beta)', 'Interactive book', 'Course Presentation', 'Virtual Tour (360)'. For others it takes noticeably longer to load.

     [Video attachment: large.mp4]

  2. In some resources there are missing icons. Examples are 'Speak the Words Set':

     [Screenshot: speak]

     And 'Unidad 1 - Herramientas para el empoderamiento' from the QA channel, where the player buttons are missing:

     [Screenshot: buttons]

I've created the following test channel zapif-babuz using the provided example resources at https://h5p.org/content-types-and-applications so that you can have a look and replicate the issues.

@rtibbles

Thanks @pcenov - I guess there was a good reason I marked this as a draft!

Development

Successfully merging this pull request may close these issues.

Read media files more efficiently from zipped files
3 participants