caching and etags semantics and perhaps wider reuse patterns #21
Comments
https://github.com/ariga/entcache is a good example of some of these semantics applied to Ent and databases. It also has a multi-level example: https://github.com/ariga/entcache/blob/master/internal/examples/multilevel/main.go
Hey @gedw99, welcome back! 👋 Hope you're doing well.
Anywhere is good with me. Worst case we can move the issue around 👍
Interesting, so in other words you're looking at triggering state changes in a GUI based on events from some backend server? e.g. subscribe to new issues on GitHub and auto-refresh your caches from those events. This idea might be possible in an FS – I think it might be in its own vein too, so like FS + event source in one object. Say, subscribe to a git repo's push events and auto-detect that updates are available for FS merging locally. Or take that a step further and perform safe merges automatically in real-time.
I'm not super experienced with S3's APIs – for example, turns out there's an older SOAP XML one and a newer REST JSON one which the It's not "supported" per se, but I see some potential in the idea and it seems possible. I think the eventing idea is orthogonal to file systems, so there's definitely plenty of room to make something that uses both! If you have ideas of how they'd integrate into Hackpad or how to expand HackpadFS, I'm all ears. HackpadFS in particular is open for extension in custom projects since it's based primarily around
You've actually hit on an interesting modern trend: Wasm running at the edge. Cloudflare workers make use of languages that can target Wasm for lots of powerful stuff. Caching at the edge (like a CDN) is a big draw for businesses. Moving data closer to the people that use it is a big deal, especially with large files.
Yes! Me too. It's a big reason why I wrote HackpadFS. The abstraction is pretty darn powerful and can be nested infinitely. If you're curious, I wrote an article on the FS side of Hackpad: https://blog.johnstarich.com/write-once-store-anywhere-extensible-file-systems-for-go-65c7c0949e74
That'd be amazing for Hackpad. We definitely need better compile performance too, so this can be viable. I'm hoping the Service Worker changes will bring down compile times enough for things like this. 🙏 (If only I could get it to stop crashing the runtime every time I tweak it! Wasm is rough sometimes...)
That's an interesting idea. I am curious how a system like this would behave in a high churn project. Maybe it could prompt before updating, sort of like the earlier git push event idea? Sounds like a radical new way to write code, so might need further thought. Could be fun to play with those ideas a bit, or see if anyone's tried something like it before. One idea I've bounced around is trying to make Hackpad work decently offline. In other words, give a great UX both online and offline with the Go Mod cache and "installed" Go version. |
Hey @JohnStarich Yep, you're understanding the concept. BTW Minio has events via webhooks, NATS, or other transports. All the info is here: https://docs.min.io/docs/minio-bucket-notification-guide.html I actually see 2 patterns:
Pattern 2 can be used to tell the clients and then perhaps kick off pattern 1. Best of both worlds, where only the changed parts of a file are copied downstream (towards the edges of the network) using ranges. Binary diffs for binary stuff like WASM could be added to HackpadFS :) Here are two concrete examples of using this concept for real-world stuff I am playing with: Golang can be compiled to WASM and WASI, like Spin does: https://github.com/fermyon/spin
Application GUI, built from small file parts and data parts using https://github.com/ajstarks/deck and https://github.com/ajstarks/decksh
The file system is the important part, because you get incremental updates flowing from the origin server to the edges (both read-only servers) and to the read/write GUI, and you get updating dashboards from Deck. For Spin, you want the file system events so that when a new WASM binary is compiled it propagates to the edges. |
That'd be pretty sweet. Do you know of any prior art, i.e. any other projects trying out eventing or perhaps using S3 notifications? I've used https://github.com/fsnotify/fsnotify before. It's pretty good for native platforms, though I don't think it is pluggable for new platforms like Wasm.
Thanks for the link, spin looks like a cool project. I'm not sure I totally understand the full picture yet, but I like the way you think 😄 Sounds like a hybrid fat client / fat server model, where processing can be split or shared.
Oh, do we know for sure tinygo doesn't work? We've still got your issue open #8 and I think there's promise still.
Interesting, I suppose you're thinking deck could be used to regenerate on source file updates. Pretty cool idea. Maybe that could be an offshoot as its own separate project for making presentations in the browser, like single-user Google Slides but all client-side. Great ideas. Sounds like we have a case to be made for things that listen to the events. So the next question might be, can the event listener interface be standardized? Hackpad itself could use ideas before standardization, but HackpadFS might need to nail it down first. Minio looks like a decent implementation of eventing and fswatch for a native one, but each are very different from one another. We might need to understand this space more thoroughly to define shared interfaces. We got really lucky the Go maintainers decided to start us off with familiar interfaces like opening and reading files on an |
I have used NATS with Minio before and it works great to get a notification of a file in minio being changed.
There are moves to do this here: https://github.com/prep/wasmexec
Slides client-side, so people can own their data. Your FS and DB sit underneath on the client, so you can run OFFLINE! Then we just need a coordination server that users can run on Google Cloud Run, so it's there but costs nothing if you're not using it. The idea is for it to be just FILES, because Deck is all file-driven. This is where the file diff concept would work well. Deck has a server, by the way: https://github.com/ajstarks/deck/blob/master/cmd/deckd/deckd.go I also think Deck is totally amazing in how it works under the hood. Decksh has its own little language that is really clever. I think there are huge possibilities. I think @ajstarks is also working on a nice little editor: ajstarks/deck#10. Happy days: you can then work from your desktop, mobile, or the web, from anywhere.
I agree with your perspective: an agnostic (works on WASM client or server) API for funcs and events makes all this come together.
Yeah, that's the right question, I think. We might get lucky and find one, but I suspect we need to invent it. Because a normal person wants to save money, we can mount a file system within Google Cloud Run: https://cloud.google.com/run/docs/using-network-file-systems. No idea what POSIX stuff is exposed in terms of file change events and content diff / binary diff. I bet it's expensive too. The other way is with your S3 driver. Run deckd in Cloud Run, and when it needs a file or needs to react to a file change, it just works. Minio, and I assume other S3 systems, manage locking and concurrency, so it should work. Hope I am not going around in circles repeating myself. It's a long thread. I think Deck is a great use case to center around, because it's a file-based system and the code is really clean too. I suspect we should work up some sort of design doc and also do a video chat with @ajstarks? |
Also, I think this project might have what we are looking for: https://github.com/stv0g/gose/blob/master/pkg/notifier/notifier.go The developer has done a good job of working out all the differences between the S3 servers that exist. This aspect is where it really shines: stv0g/gose#19 (comment) |
You're good 😄 I like the brainstorming. Maybe the best path forward here is folks making new libraries to provide FS's like an "S3-with-events" FS? HackpadFS defines interfaces and contains an S3 example sub-module, but I think a community built FS could try things like this more easily. i.e. No strict compatibility or stability restrictions. You're welcome to kick it off by copying the S3 example to a new open source repo. 👍 Seems like you could iterate through several variants of event listening to see which ones work best. Once a few FS's pop up out there with event listening, we can definitely look at any emerging standards to include them here in HackpadFS too. I think that gives us the best chance of making a good choice. |
OK, I will do that. Thanks! I am going to close this and will open a new issue when things pop up. |
Hey @JohnStarich
Hope all is well.... I am really fascinated with your project and wider implications of this.
I am not sure if this issue / question should be here or in the FS repo, but here goes...
One thing I am working on is the relationship between retained-mode GUIs and data caches. A retained-mode GUI can be conceptually designed like the data cache systems we all know and love. A GUI has a template that is computed to produce some markup. In a retained-mode GUI system you want to know if the template has changed. It would be a signal at the control plane level (if you get my drift), so you can do optimistic re-rendering of that GUI component instance, which is sort of like eager caching. Hope this is making sense :)
Some background might also help... I know this is pretty out there...
The DOM in a browser does the retained-mode diffing for you. It knows when you changed the HTML in the inspector and magically updates the screen for you. But when you're working at the WASM level and rendering with WASM to a WebGL canvas, you need to do this yourself.
So now I get to my question. The FS implementation is super cool because it can work in a WASM, local, or remote context, but I was wondering if it supports ETag-style caching or other metadata semantics to tell the caller that it has a cache miss and needs to refetch, because your local data is old, buddy!
Go's net/http file server supports ETags: https://go.googlesource.com/go/+/go1.8.3/src/net/http/fs.go; see line 119. So it can easily act as a cache in front of a DB called over HTTP, for example.
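For anyone who wants to see that behaviour in isolation, here is a small sketch using only the standard library. `http.ServeContent` checks `If-None-Match` against an `ETag` header the handler has already set and answers `304 Not Modified` on a match. The helper names (`etagHandler`, `fetchStatus`), the file name, and the tag values are made up for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// etagHandler serves a fixed body with a caller-supplied ETag.
// http.ServeContent evaluates If-None-Match against the ETag header
// set on the response and replies 304 when the tags match.
func etagHandler(etag, body string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("ETag", etag)
		http.ServeContent(w, r, "data.json", time.Time{}, strings.NewReader(body))
	})
}

// fetchStatus issues an in-process GET, optionally with If-None-Match,
// and returns the response status code.
func fetchStatus(h http.Handler, ifNoneMatch string) int {
	req := httptest.NewRequest(http.MethodGet, "/data.json", nil)
	if ifNoneMatch != "" {
		req.Header.Set("If-None-Match", ifNoneMatch)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := etagHandler(`"v1"`, `{"hello":"world"}`)
	fmt.Println(fetchStatus(h, ""))     // no cached copy: 200
	fmt.Println(fetchStatus(h, `"v1"`)) // cached copy is current: 304
	fmt.Println(fetchStatus(h, `"v0"`)) // cached copy is stale: 200
}
```

Note that the ETag value has to be a quoted string for the comparison to work.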
The reason i bring this up is because you also want to do the same thing inside the WASM environment for many use cases.
Let's say in the WASM environment you have an FS (IndexedDB under the hood) that is sourced from S3. When the S3 source changes, I presume there is some sort of ETag header to tell you. So when there is a call to the WASM FS, it should realise that the local FS is now "dirty" or "invalidated", automatically get the data from S3, update the local file, and return that to the caller.
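Roughly what I mean, as a sketch only. The `Origin` interface, `CachedStore`, and `memOrigin` are hypothetical names invented here, and the in-memory maps stand in for S3 and the local IndexedDB-backed FS. Nothing below is an existing HackpadFS API. On every read the store asks the origin for the current ETag and only refetches when it differs from the local copy.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("not found")

// Origin is a hypothetical remote source, for example an S3 bucket.
type Origin interface {
	ETag(name string) (string, error)
	Fetch(name string) (data []byte, etag string, err error)
}

type entry struct {
	data []byte
	etag string
}

// CachedStore keeps local copies and revalidates them by ETag on read.
type CachedStore struct {
	origin Origin
	local  map[string]entry
}

func NewCachedStore(o Origin) *CachedStore {
	return &CachedStore{origin: o, local: make(map[string]entry)}
}

// ReadFile returns the local copy when its ETag still matches the
// origin; otherwise it refetches, updates the local copy, and returns it.
func (c *CachedStore) ReadFile(name string) ([]byte, error) {
	remoteTag, err := c.origin.ETag(name)
	if err != nil {
		return nil, err
	}
	if e, ok := c.local[name]; ok && e.etag == remoteTag {
		return e.data, nil
	}
	data, tag, err := c.origin.Fetch(name)
	if err != nil {
		return nil, err
	}
	c.local[name] = entry{data: data, etag: tag}
	return data, nil
}

// memOrigin is an in-memory Origin that counts full fetches.
type memOrigin struct {
	files   map[string]entry
	fetches int
}

func newMemOrigin() *memOrigin {
	return &memOrigin{files: make(map[string]entry)}
}

func (m *memOrigin) Put(name string, data []byte, etag string) {
	m.files[name] = entry{data: data, etag: etag}
}

func (m *memOrigin) ETag(name string) (string, error) {
	e, ok := m.files[name]
	if !ok {
		return "", errNotFound
	}
	return e.etag, nil
}

func (m *memOrigin) Fetch(name string) ([]byte, string, error) {
	e, ok := m.files[name]
	if !ok {
		return nil, "", errNotFound
	}
	m.fetches++
	return e.data, e.etag, nil
}

func main() {
	origin := newMemOrigin()
	origin.Put("config.json", []byte("v1"), `"e1"`)
	store := NewCachedStore(origin)

	store.ReadFile("config.json") // miss: full fetch
	store.ReadFile("config.json") // hit: ETag check only
	origin.Put("config.json", []byte("v2"), `"e2"`)
	data, _ := store.ReadFile("config.json") // invalidated: refetch

	fmt.Println(string(data), "fetches:", origin.fetches)
}
```

This version still makes one small round trip per read to compare tags. Combining it with change events would let the store skip even that until the origin says something changed.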
So i am wondering if this is currently supported.
It will be really interesting to see how this also relates to the Web Worker and Service Worker stuff.
It will be really interesting to build quasi-CDN architectures where you can have caches at the WASM, local, and remote levels, and then also cross-mix them. Markup templates are dependent on data, which can be dependent on the cache, and it's "turtles all the way down". I like the simplicity of this.
You also get into a interesting situation with Design time versus Runtime also.
I use the words "real time" because the IDE is running in a browser and compiling as you change code. In non-real time, you're working off already-compiled Golang.
At real-time design time, you want your Go imports to always be the latest, and when someone else merges a PR into code that you import, you want to know NOW, see it, and adapt. You want to get broken by them...
At real-time runtime, you want to use a static version and not break.
The same goes for data. You have static data (JSON, binaries, etc.) and you have dynamic data.
At real-time design time, you want everything to be dynamic. Don't use any caching anywhere.
At real-time runtime, you want everything that is static to never update, and everything that's dynamic to use the cache mechanism.
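Written down as a tiny decision table, the rules above might look like this. All the names (`Mode`, `Kind`, `Policy`, `policyFor`) are hypothetical and only there to pin down the idea; this is a sketch, not a proposal for an actual API.

```go
package main

import "fmt"

// Mode is when the code or data is being used.
type Mode int

const (
	DesignTime Mode = iota
	Runtime
)

// Kind is whether the resource is expected to change.
type Kind int

const (
	Static Kind = iota
	Dynamic
)

// Policy is the caching behaviour to apply.
type Policy int

const (
	AlwaysRefetch Policy = iota // bypass caches entirely
	Pinned                      // never update once fetched
	Revalidate                  // use the cache, checking freshness
)

func (p Policy) String() string {
	switch p {
	case AlwaysRefetch:
		return "always-refetch"
	case Pinned:
		return "pinned"
	default:
		return "revalidate"
	}
}

// policyFor encodes the rules: design time is fully dynamic,
// runtime pins static resources and revalidates dynamic ones.
func policyFor(m Mode, k Kind) Policy {
	if m == DesignTime {
		return AlwaysRefetch
	}
	if k == Static {
		return Pinned
	}
	return Revalidate
}

func main() {
	fmt.Println("design/static: ", policyFor(DesignTime, Static))
	fmt.Println("design/dynamic:", policyFor(DesignTime, Dynamic))
	fmt.Println("run/static:    ", policyFor(Runtime, Static))
	fmt.Println("run/dynamic:   ", policyFor(Runtime, Dynamic))
}
```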