Feature request: support network-based filesystem support for large files #4

Open
bmaranville opened this issue Jan 15, 2022 · 4 comments

@bmaranville
Member

For files too large to load into memory, it would be nice to be able to load parts of a file on-demand over the network (see discussion in #2). The new WASM Filesystem being designed for emscripten seems like it is anticipating this use case: see design documents at emscripten-core/emscripten#15041

When this is implemented and generally available, look into adding this capability to h5wasm so that very large files can be retrieved piecewise and on-demand by URL.

@bmaranville
Member Author

It appears that this feature already exists in emscripten: https://emscripten.org/docs/porting/files/Synchronous-Virtual-XHR-Backed-File-System-Usage.html

Prerequisites:

  • the server for the remote file has to support range requests
  • and has to return files uncompressed (if they are served gzipped, the whole file is loaded instead of using range requests, so nothing breaks but the benefit is lost)
  • the h5wasm library has to be compiled with ENVIRONMENT=web,worker instead of just ENVIRONMENT=web, as is done now. I will make the adjustment in the next release.
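
The first two prerequisites can be checked before trying a lazy load. The following is a minimal sketch, not part of h5wasm or emscripten: `checkLazyLoadHeaders` and `probe` are hypothetical names, and the check assumes the server reports `Accept-Ranges: bytes` when it supports range requests and sets `Content-Encoding` when it compresses the response.

```javascript
// Hypothetical helper, not part of h5wasm or emscripten.
// `headers` is a plain object mapping response header names to values.
function checkLazyLoadHeaders(headers) {
  // HTTP header names are case-insensitive, so normalize before lookup.
  const normalized = {};
  for (const [name, value] of Object.entries(headers)) {
    normalized[name.toLowerCase()] = String(value).trim().toLowerCase();
  }
  const acceptsRanges = normalized["accept-ranges"] === "bytes";
  // A missing Content-Encoding header is treated as "identity" (uncompressed).
  const encoding = normalized["content-encoding"] || "identity";
  const uncompressed = encoding === "identity";
  return {
    acceptsRanges,
    uncompressed,
    lazyLoadLikely: acceptsRanges && uncompressed,
  };
}

// Feed it the headers of a HEAD request (browser, worker, or Node 18+).
async function probe(url) {
  const response = await fetch(url, { method: "HEAD" });
  return checkLazyLoadHeaders(Object.fromEntries(response.headers));
}
```

For cross-origin URLs the browser may hide these headers unless the server exposes them through CORS, so a negative result from `probe` is a hint rather than proof.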

Then worker code like this can function:

import * as hdf5 from "../h5wasm/dist/esm/hdf5_hl.js";

// Handle to the lazily loaded file; stays undefined until "load" has run.
let file;
const DEMO_FILEPATH = "https://ncnr.nist.gov/pub/ncnrdata/ngbsans/202009/nonims294/data/sans114140.nxs.ngb?gzip=false";

self.onmessage = async function (event) {
    const { action } = event.data;
    if (action === "load") {
        await hdf5.ready;
        // Arguments: parent dir, name, URL, canRead, canWrite.
        // Contents are fetched on demand rather than downloaded up front.
        hdf5.FS.createLazyFile("/", "current.h5", DEMO_FILEPATH, true, false);
        file = new hdf5.File("current.h5");
    }
    else if (action === "get") {
        await hdf5.ready;
        if (file) {
            // Reading an attribute only pulls the parts of the file it needs.
            self.postMessage(file.get("entry").attrs["NX_class"].value);
        }
    }
};
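
For completeness, the page that owns the worker might drive it roughly as follows. This is only a sketch: `requestNxClass` and the file name `worker.js` are hypothetical, and it assumes the worker answers a "get" message with a single `postMessage`, as in the code above.

```javascript
// Hypothetical main-thread helper; `worker` is any object with
// postMessage/onmessage/onerror, such as a Web Worker.
function requestNxClass(worker) {
  return new Promise((resolve, reject) => {
    worker.onmessage = (event) => resolve(event.data);
    worker.onerror = (error) => reject(error);
    // Messages are delivered in order, so "load" is handled before "get".
    worker.postMessage({ action: "load" });
    worker.postMessage({ action: "get" });
  });
}

// In a browser that supports module workers (file name is an assumption):
// const worker = new Worker("worker.js", { type: "module" });
// requestNxClass(worker).then((value) => console.log(value));
```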

@cavenel

cavenel commented Dec 1, 2022

I needed the exact same feature, and this solution works really well!
But if I understand correctly, modules inside workers have never been implemented in Firefox (https://bugzilla.mozilla.org/show_bug.cgi?id=1360870). Is there a way to convert the module code into normal JS code for Firefox? I am a bit stuck here...

@bmaranville
Member Author

I haven't been providing an IIFE "plain javascript" build because it's easier to support ESM outputs, but you can get a version of the library that will work

  • in a worker
  • in Firefox

if you do a little bit of post-processing on the distributed library.
First install esbuild, then bundle the ESM output into a single IIFE file:

npm i esbuild
npx esbuild --bundle dist/esm/hdf5_hl.js --outfile=h5wasm_iife.js --format=iife --global-name=h5wasm

The resulting file can then be loaded in (any!) worker with

importScripts("h5wasm_iife.js");
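
Put together with the earlier lazy-loading example, a classic (non-module) worker could look roughly like the sketch below. The `makeHandler` factory is a hypothetical name, and the sketch assumes the `h5wasm` global created by `--global-name=h5wasm` exposes the same `ready`, `FS` and `File` members that the ESM import does.

```javascript
// Hypothetical factory: `lib` is the h5wasm global, `url` the remote file,
// and `post` a function that sends a value back to the page.
function makeHandler(lib, url, post) {
  let file = null;
  return async function onMessage(event) {
    const { action } = event.data;
    await lib.ready;
    if (action === "load") {
      // Same lazy-file setup as in the ESM worker above.
      lib.FS.createLazyFile("/", "current.h5", url, true, false);
      file = new lib.File("current.h5");
    } else if (action === "get" && file) {
      post(file.get("entry").attrs["NX_class"].value);
    }
  };
}

// Only runs inside a classic worker, where importScripts exists.
if (typeof importScripts === "function") {
  importScripts("h5wasm_iife.js");
  const DEMO_FILEPATH = "https://ncnr.nist.gov/pub/ncnrdata/ngbsans/202009/nonims294/data/sans114140.nxs.ngb?gzip=false";
  self.onmessage = makeHandler(h5wasm, DEMO_FILEPATH, (value) => self.postMessage(value));
}
```

Passing the library in as an argument keeps the handler independent of how it was loaded, so the same function would work with the ESM import once module workers are available.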

In fact, it's so easy I should probably include that build in a future release!

@cavenel

cavenel commented Dec 2, 2022

That was indeed super easy and works like a charm. ❤️
(I hope Firefox gets modules in workers soon though, it would make life simpler for many)

Thank you so much for the prompt answer!! (And I guess this issue could be closed as everything works as intended)
