Compressed packages #1278
Well, as far as I understand it, the transfer of packages from npm is done with archives that are then expanded. Also, I'm not sure this would actually use less disk space (perhaps in source control, but then you shouldn't be checking in node_modules anyway), the reason being that the archive will still need to be uncompressed somewhere when the package is loaded.
Modules inside such an archive would need special handling; I'm thinking of a userspace module that wraps the loader.
Correctly, and cached at
This can be done temporarily in memory when calling it.
require.extensions would be a good solution for this use case, but it's deprecated... :-(
I'd say that in general memory is more of a premium than disk space, so for most cases that balance probably isn't worth it. Giving the user the choice probably isn't a bad thing, but I'm not sure it warrants the effort, potential breakages and maintenance cost; for example, I'm not sure how things like __dirname would work (at least on Windows) for an in-memory file system. Just my 2c.
Oh, no, this is not necessary: once the JavaScript objects are loaded, the files are not needed anymore. Another thing would be if a package has some extra files like images or so; in that case a filesystem would be needed anyway, but that's not related to how modules are loaded.
Maybe cc @bmeck
Videos:
Empire Node: https://www.youtube.com/watch?v=k5r0kQlsDgU
Node Summit: http://nodesummit.com/media/universal-distribution-via-node-archives/
@piranna please check out bmeck/noda-loader and bmeck/resource-shim. There are multiple reasons we want it to be a zip file and not a tarball.
@silverwind modules using archives etc. should move to a resource-loading approach. We tried to implement fs virtualization and saw leaky abstractions as a problem (look at jxcore's issue list about all of these). Use something like the links above to get read-only resources that do not attempt to do things like preserve stat() compatibility / fs case sensitivity / disable half of the fs module because you can't write / etc. I have been doing a fair amount of testing of this in userland and there are ways to do it well, but it does require some code changes. All the virtual fs things I have seen (jxcore, atom, nexe, http://www.py2exe.org/) have issues with their fake fs that are very hard to detect and solve. I'd rather have explicit changes that are blatant, with somewhat easy-to-fix issues, by making a resource API. As an added bonus you can start moving away from archive-based resources and introduce things like a generic way to load resources from the internet / purely from memory (basically how the loader works right now) / etc.
@piranna also forgot: you do need to keep the zip archive in memory, otherwise you can get into mutation issues. Luckily we can just open the file rather than read it entirely.
The two notable packages that have problems with this are NodeSchool's workshopper and npm's bundled node-gyp, due to trying to execute processes inside of their directory; the solution is to extract the executables to disk, or to install them as siblings rather than inside of the module itself.
How about new ways to do things... require("xxx")
That leads to the following problems (exposed above)
In the end we need to keep legacy code working, so I would add something for that. I hope the "node://" idea solves many problems.
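As a rough illustration of the "node://" idea, such URLs could be dissected with the WHATWG URL parser already in Node. Mapping the package name to the URL host and the resource path to the pathname is an assumption of this sketch, not an agreed design; the package name is made up.

```javascript
'use strict';
// Hypothetical "node://" resource URL: node://<package>/<path inside package>
const u = new URL('node://some-package/assets/logo.png');

const pkg = u.host;          // the package the resource lives in
const resource = u.pathname; // path of the resource inside the package
```

Because "node:" is a non-special scheme, the parser still splits authority and path after "//", so no custom parsing would be needed for the basic shape.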
I don't think this is necessary for JavaScript... Or are you talking about resources (not exported JavaScript code)?
Yeah, having an open file descriptor to the compressed file is a nice idea; this way it would be prevented from being changed on the fly :-)
This is intended for npm 3 or 4 and, as I pointed out on the npm issue, with a flat tree hierarchy all dependencies would be compressed, only expanding them and using their internal files when needed.
What advantages does this have over that?
@piranna I was referring to mutation of resources (including partial mutation of module code). Right now this is possible, for example, if you lazily load files from the archive.
@llafuente I strongly urge not to use a virtual file system:
The major point is: changing fs to not refer to the system fs is a bad idea, and leads to implementation issues that are not easily solvable, logical issues that are not easily solvable, and misleading usage if you do full emulation. Having all fs operations fail inside of your archive is, I find, preferable to a sometimes-it-works approach to the fs module. It makes failure fairly fast, and also makes it fast to find what to fix (you get big angry warnings). If you do want to continue down this route:
I still strongly urge you to go look at the node/iojs/npm issues arising from filesystem differences (there are several). Then go take a look at the projects I listed above for packaging issues related to the filesystem.
Why not extract to a temporary folder?
@silverwind it depends on the purpose of compressed packages, for example:
Hi @bmeck, I feel that there is no solution that would truly work for all use cases; that's why I propose a clear path that I believe covers most.
About the protocol problems you mention:
It's a new protocol, so it's our decision whether it's case-sensitive or not... (and it should be, because of Linux)
Not "really" needed; permissions are at the zip level (I hope that should be enough)
"node://" isn't available at the OS level; if you need that... I don't see any solution for this use case other than extracting your module (and maybe all modules) on the fly, in any implementation. Spawning a process directly from the zip file or from memory... I really don't know if it's even possible, even with the protocol...
I don't think anybody writes data inside their modules; write to a temp folder and, again, don't flag your module.
Are you sure:
I propose a new protocol that reads one file, for simplicity, efficiency and ease of implementation. Managing multiple files would be a mess. When somebody needs to read a file inside their module, you have to give them something. The real problem I see here is that both solutions must coexist, and that leads to many edge cases, well exposed here, and we can't find solutions to all of them because I think there are none.
This smells to me, as it does not relate to actually changing your code to avoid problems when problems could arise. It's like telling someone "trust me", while it only mostly works.
If you use fs and fs.stat is not perfect: problems. If fs permissions are not perfect: problems. If fs limitations for modules that detect your OS are not emulated... it goes on. It is not about being good enough; it is about being perfect or being a liar.
This is fine, we can't assume any
I am unclear on what you mean by "root device".
I avoid this in bmeck/noda-loader but I do not use the
You only need to keep 1 fd open per zip file, once again look at noda.
I am unsure they must coexist, but I do agree that they each have trade-offs. I side with explicit changes, mainly because I don't like my APIs to work differently (slightly, sometimes, not exactly) based upon context; this is true even if the API works well enough for most cases.
I will expose all the problems I see and we can just find a solution that solves all (or most) of them.
@bmeck you have experience, expose more crazy things! Most of the problems exposed are academic, but there they are. I will try to study what Python does and its limitations; I'm sure there are many.
Going to just state that I actively want this. GitHub is bad for comparison views, so I will put up a big comparison over the next day or two on a Google spreadsheet that anyone can edit.
also see node-forward/discussions#10 and node-forward/discussions#10 (comment)
Ideally packages should be read-only and somewhat black boxes, and most of them are, so package mutation should not be a problem, and compressed packages would encourage this. The only "valid" case where a module would need to write some files inside itself is compiled ones; in that case it's easier to leave them decompressed, or compress them again after the module is compiled. Such modules can be easily identified because they have an "install" entry in the "scripts" field of "package.json" or a "binding.gyp" file.
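The identification heuristic described above can be sketched as a small predicate: a package probably compiles native code (and so needs to stay writable during install) if it declares an "install" script or ships a gyp build file (binding.gyp). The function name and shape are made up for illustration.

```javascript
'use strict';
// pkgJson: the parsed package.json object.
// fileList: names of files at the package root.
function needsBuildStep(pkgJson, fileList) {
  const hasInstallScript =
    Boolean(pkgJson.scripts && pkgJson.scripts.install);
  return hasInstallScript || fileList.includes('binding.gyp');
}

const native = needsBuildStep({ scripts: { install: 'node-gyp rebuild' } }, []);
const pure = needsBuildStep({}, ['index.js', 'package.json']);
```

A package manager could use such a check to decide which dependencies to leave decompressed.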
URIs are stackable, so what you propose could be somewhat doable with a
I also want to add that this could help a lot with reducing inode usage. It's not uncommon for node_modules to contain a lot of files; these tend to multiply and result in hundreds of thousands of inodes. Most filesystems have a fixed number of inodes set at creation time, and this limit can easily be reached by Node projects.
It might be worth moving this discussion over to iojs/NG.
For reference, Python has had this feature, implemented by the zipimport module, since Python 2.3 :-)
@bmeck I read your spreadsheet over at https://docs.google.com/spreadsheets/d/1PEDalZnMcpMeyKeR1GyiUhOfxTzQ0D05uq6W9E9AVYo/edit#gid=0, and the resource API seems most reasonable to me. What's involved in implementing it? Any caveats?
@Fishrock123 https://github.com/bmeck/noda-loader + https://github.com/bmeck/resource-shim has it all set up; it would need a route to get it into core, plus code coverage.
Closing due to inactivity. No one followed up with working code so far, and the intrusive changes needed to the module loader would probably make it a tough sell. I don't envy whoever wants to tackle this.
I'm still interested in this, I only left it for discussion. I don't think
@piranna noda works fine for now, but I am not attempting to push it into core. The main purpose of my use case was superseded by using Docker.
It's a shame :-( I'm too busy at the moment with work and with
I can't believe this topic died so easily! Being productive in the Node environment is really hard, and this makes it a lot harder.
Maybe you could implement it? I think it shouldn't be too difficult.
I don't really see it as a small task. As I have some vision of it, I'll share:
The dilemma I can see here is how exactly to package modules into files, because as I said, a simple vue.js webpack template project has about 770 dependencies. If I misunderstood something then it's probably due to my too-shallow knowledge of the Node environment.
I don't feel like doing this. There are some people with more time who are still developing node/npm; I'm not one of them. I'm just curious why people aren't discussing this for the sake of the performance of their work.
You are confusing Node.js with npm. For Node.js it is just a matter of
https://github.com/bmeck/noda-loader has a
That's because tar doesn't have a central directory, but it's possible to
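The central-directory point is the key structural difference between the two formats: a zip ends with an "end of central directory" (EOCD) record that indexes every entry, so listing a zip is a seek from the end, while a tar must be scanned entry by entry. A sketch, using the fact that a minimal empty zip is just the 22-byte EOCD record:

```javascript
'use strict';
// Minimal empty zip: EOCD signature "PK\5\6" followed by 18 zero bytes
// (entry counts, central directory size and offset, comment length).
const emptyZip = Buffer.concat([
  Buffer.from([0x50, 0x4b, 0x05, 0x06]),
  Buffer.alloc(18),
]);

// Locate the EOCD by scanning backwards from the end of the file
// (it may be followed by a variable-length zip comment).
function findEOCD(buf) {
  for (let i = buf.length - 22; i >= 0; i--) {
    if (buf.readUInt32LE(i) === 0x06054b50) return i;
  }
  return -1;
}

const offset = findEOCD(emptyZip);
```

Once the EOCD is found, the central directory offset it contains gives random access to every entry without reading the rest of the archive.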
If you want to read all of it, note that node_modules dirs get particularly large when you are putting dependencies inside of them.
I don't want to read them, but instead walk over them. It's only needed the
Did anyone solve this? Does this help?
This does not have enough likes! I'm tired of 5-minute transfers of a simple Angular 6 CLI project with node_modules installed between an HDD and an SSD. Even on SSDs, the file count is huge. Any npm project is a pain to compress, move, or copy. Any progress on this?
@darkguy2008 see https://github.com/WICG/webpackage, which should be compatible with Node but isn't coming in the short term due to needing to be ratified.
I sometimes feel like a
Thanks for linking my feature request here.
(Originally posted at Node.js)
Allow require() of compressed packages as dependencies inside the node_modules folder, the same way Python does. The idea is to use less disk space and make transfer of projects faster, since it would move only one file instead of a set of them. It would also allow checksumming them; obviously in this case the compressed package would not have its own dependencies inside, except if shrinkwrap or bundleDependencies are being used, but the Node.js package resolution algorithm (search for a node_modules folder in the current directory and all its parents up to /) would be able to resolve this.
To do so, the only change needed would be that when a compressed (.zip?) file is found inside the node_modules folder, it is expanded dynamically and its package.json processed the same way as with folders.