The question is

  • (what is it)?
  • should Nim have a vfs?

and part of answering the above question

  • is it a Nim way of doing things
  • if so, should it be part of stdlib or a nimble package

What is a vfs

  • packaging multiple files into an embedded or archived file (zip, tar, bzip, ....)
  • keeping together all the files such as databases/configs/resources/etc for an application or project
  • simplify distribution of an application which requires multiple files
  • requires a means of treating the vfs like a normal file system
  • provides quasi obfuscation of files

Languages like Tcl use this methodology to package files into a single file for distribution. The real power comes from making the VFS a kit, which provides a means of executing the contents regardless of platform: the same file can be deployed to multiple platforms, giving you a platform-independent file (but this is something more than just a VFS).

I approach this idea for Nim as meaning: "provide the means to deal with compressed/archive files like they are just a part of the file system" (but is this what everyone understands by a VFS?)

for example,
walkDir("/somepath/somezipfile.zip")

would walk the contents of the somezipfile.zip file as if it was some normal directory (instead of being a compressed file)
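
To make the idea a bit more concrete, here is a minimal sketch using only the standard library. It is not a real zip reader: `Archive` and `walkArchiveDir` are hypothetical names, and the archive is modelled as an in-memory table from entry path to contents. It only illustrates what walking an archive like a directory could look like.

```nim
import std/[tables, strutils, algorithm]

type
  # Hypothetical stand-in for an opened archive:
  # maps each entry path inside the archive to its contents.
  Archive = Table[string, string]

iterator walkArchiveDir(a: Archive, dir: string): string =
  ## Yields the direct children of `dir` in sorted order,
  ## loosely mirroring what os.walkDir does for a real directory.
  let prefix = if dir.endsWith("/"): dir else: dir & "/"
  var seen: seq[string]
  for path in a.keys:
    if path.startsWith(prefix):
      # First path component below `dir` is the child name.
      let child = path[prefix.len .. ^1].split('/')[0]
      if child.len > 0 and child notin seen:
        seen.add child
  seen.sort()
  for child in seen:
    yield prefix & child

var archive = initTable[string, string]()
archive["/levels/00/header.msgpack"] = "header bytes"
archive["/levels/00/map.bin"] = "map bytes"
archive["/readme.txt"] = "hello"

for entry in walkArchiveDir(archive, "/levels/00"):
  echo entry
```

A real implementation would read the entry list from the archive's own index instead of a table, but the calling code could stay the same.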

What do people think about this?

2017-01-11 05:23:25

This would be extremely cool!

There was some discussion on the forum about multiple modules per file, which could give some of the same advantages. And source-file archives treated as folders already exist for Java and PHP (.jar and .phar).

2017-01-11 08:36:16

There are some interesting new ideas for what a programming language can offer to standardize, simplify, and secure the packaging of external dependency files, but I think embedding zip files in the executable is the wrong way to do this...

I encourage everyone to read up on IPFS, and to address this VFS matter (and other matters) from the perspective of building more fault-tolerant and delay-tolerant ("inter-planetary") network apps with as little centralization as possible.

  • ZIP and other general file compression is nearly useless for improving download performance, because the vast majority of the data being transferred these days is multimedia that is already in its ideal compression format. There are also specialized compression formats for collections of textfiles (based on PPM), structured data formats, etc - and I hope Nim will encourage their use - but compressing different kinds of files into one file has never been a good idea.
  • With modern protocols like HTTP/2, BitTorrent, and especially IPFS (of which I am a huge fan), there's almost no downside to downloading a list of separate files rather than one zip, and there are many advantages. One advantage is that you can prioritize which files should be downloaded first, so the user may begin using your app while the rest of it downloads, with specific pieces given priority or skipped based on user choices. And sometimes you may want to download different files (ex. video resolutions) based on what the app detects as it runs.
  • Cacheability is the key to good performance. The fastest-loaded data is the data that you already have locally (or on your LAN, or on your ISP, or as few hops / light-(milli)seconds away as possible). If you are bundling an app and all its dependencies into one file, there's no way to tell if you already have some of those dependencies cached. With IPFS, when X people on the same WiFi hotspot (or even in all of New Zealand, depending on their latency) download your app, it will only download from your server once, and the rest will download from each other. And when you release a code bugfix to your app / game, there's no reason for unchanged media components to be downloaded again.
  • Cacheability of public BLOBs also improves reliability and privacy. Storage is cheap (and it's getting cheaper faster than just about any other hardware category), so we will soon see more and more caching NAS servers, and perhaps even automated preemptive caching. If you have terabytes of empty NAS space to burn (especially if you're an ISP or admin at a large network), why ever throw away any public IPFS file that someone on your network has downloaded, in case someone else ever wants it again? This means no upstream ISP, government, network failure, DoS, etc can censor this data, or even know if it was requested by a specific user or a preemptive caching AI.
  • Identifying files by checksum instead of URL improves security. This one should be obvious - you get exactly the file the author intended, and any man-in-the-middle attacks would be detected.
  • From the cacheability perspective, the best way to transfer software packages is as source (like BSD ports, Gentoo Portage, etc), because it eliminates variations based on platform, compiler version, library versions, settings, USE flags, etc. So if we want a JAR equivalent for Nim, it should contain just the source code and multihash links to all dependencies. (I wonder if we can come up with an intermediate format that stores a Nim project as one binary AST file for faster compilation on any platform.) The installer / nimble can start downloading the dependency BLOBs via IPFS in parallel while it compiles the binary.

And so I think the best Nim tool for packaging bundled dependencies would take a path or a local list of files, add them to IPFS, and store a lookup table of filenames and their multihash addresses - but not the files themselves.
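
As a rough sketch of what such a lookup table could look like (standard library only): `contentId` and `buildManifest` are hypothetical names, and `std/hashes` is only a non-cryptographic placeholder for the multihash that adding the file to IPFS would actually return.

```nim
import std/[os, tables, hashes, strutils]

proc contentId(data: string): string =
  ## Placeholder content identifier. A real tool would store the
  ## multihash reported by IPFS; std/hashes is NOT cryptographic and
  ## is used here only to keep the sketch dependency-free.
  toHex(hash(data))

proc buildManifest(root: string): OrderedTable[string, string] =
  ## Maps each file path (relative to `root`) to its content identifier.
  ## The files themselves are not stored in the manifest.
  result = initOrderedTable[string, string]()
  for path in walkDirRec(root):
    result[relativePath(path, root)] = contentId(readFile(path))
```

Two files with identical contents get the same identifier, which is what lets a downloader skip anything it already has cached.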

The content-addressable links in that dependency manifest can then be downloaded by nimble, an OS package installer, or ideally the app itself. The bundled library / install tool could download the bundled files via whatever available alternative is best: local IPFS daemon, local ipfs command, perhaps as a Nim wrapper around a C implementation of libp2p, or via the HTTP gateway if all else fails.

Despite being based on IPFS, a high-level VFS library can be offered by Nim to access those files very easily from the developer's point of view, abstracting away all the hashes and just using their original filenames. The fetching / verifying of dependencies from IPFS could start in a separate thread as soon as the program starts, and the developer can specify what files are to be fetched first. If they weren't preemptively fetched by the install process, there would of course be a delay, just like with a Web browser while it downloads a Web page.

let introFiles = vfsWildcard("/levels/00/**")
if not introFiles.allReady():
  introFiles.setPriority(vfsPrioExclusive)
  loadingScreen.heading = "Downloading Level Data..."
  loadingScreen.show()
  while not introFiles.allReady():
    if introFiles.fatalError:
      terminateWithError(introFiles.errorMsg)
    loadingScreen.label2 = "Time left: " & introFiles.statusStr
    sleep(250)
  loadingScreen.hide()
let levelDataFile = vfsFileStream("/levels/00/header.msgpack.xz") # auto unxz
msgpack4nim.unpack levelDataFile, myLevelData
...

Just brainstorming...

2017-01-16 03:06:41

The idea has nothing to do with decreasing download time or size, just easing dealing with sources, like treating the archive file as a directory, not having to unpack it first.

2017-01-16 06:05:06

Any high-level programming language typically provides a zip library with random access to the compressed files.

My reply was to the first post, which was brainstorming "a methodology to package files into a single file for distribution", with a simplified way for the program to load its bundled resources. This has been common since MS Visual Studio in the early 90s, but many things have changed since then.

I've been thinking about this a lot lately, with some of the ideas expressed in the braindump above. I've explained why that "single file" fetish has no practical benefits, and would result in every user downloading the same resources over again whenever one thing in your app was updated.

2017-01-16 07:39:56

Some thoughts:

The idea seems to be basically what Java does with *.jar files. And you say Tcl also does it. So why can they do this? They can because running both a Java application and a Tcl script requires a runtime environment (Tcl interpreter / JRE) on the target machine. So if the user must install this runtime environment anyway, it is minimal pain to add such a VFS to it.

Nim, on the other hand, does not depend on a special runtime environment (no more than C applications, anyway). So providing Nim artifacts in the form of a VFS would suddenly require users of Nim applications to install an additional tool. This is clearly an inconvenience and the question is whether the benefits of this VFS amortize this inconvenience.

In contrast to Java, Nim libraries are usually compiled into the application. I have never seen a Nim lib that is dynamically linked to other Nim code. So for libraries, the source code is distributed. This will, of course, not always be the case – when a Nim library requires non-code resource files on its own, it will need a way to tell an application that it needs those files. This is where a VFS approach may come in handy. Resources are usually necessary for GUI libraries, which is a special case, but also may be useful for things like i18n. So we may need a more sophisticated approach for including libraries than just nimble install.

But this can all be done at compile time with a bundling script. I think the better thing to have would be a bundler that can build native bundles for common target platforms, like an .app bundle for macOS, a .deb / .rpm package for some Linux distributions and an installer for Windows. This bundler would also have a minimal API for platform-independent access to application resources (which will be put in /usr/share on Linux, but inside the app bundle / installation directory in macOS / Windows).

So, basically, I am suggesting that an API that emulates some kind of VFS would be good but the backend should be the application resource management native to the target system instead of some zipped file.
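
A minimal sketch of what I mean by such an API, assuming the bundler puts resources in the locations mentioned above. `resourceDir` and `resourcePath` are hypothetical names, and the exact directory layout per platform is an assumption, not an existing convention of any Nim tool.

```nim
import std/os

proc resourceDir(appName: string): string =
  ## Hypothetical helper: where the bundler is assumed to have placed
  ## the application's resources on the current platform.
  when defined(macosx):
    # Foo.app/Contents/MacOS/foo -> Foo.app/Contents/Resources
    result = getAppDir().parentDir / "Resources"
  elif defined(windows):
    # Assumed layout: a "resources" folder next to the executable.
    result = getAppDir() / "resources"
  else:
    result = "/usr/share" / appName

proc resourcePath(appName, name: string): string =
  ## Platform-independent lookup: callers only pass the resource name.
  resourceDir(appName) / name

echo resourcePath("myapp", "icons" / "logo.png")
```

The application code never mentions a platform; only the backend behind `resourceDir` changes.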

2017-01-16 09:35:03
VFS seems to be at least a two-headed beast:
  • allowing an archive to be treated like a normal file system
  • packaging/bundling a distribution of some kind

For the packaging part, "platform independence" is an important goal.

(@LibMan) I wonder if we can come up with an intermediate format that stores a Nim project as one binary AST file (for) faster compilation on any platform.

This provides obfuscation as well as "platform independence". (Should the packaging always include the AST, and optionally the source code?)

2017-01-17 03:47:14
Why not use asar files for this?
2017-01-18 00:18:40

You may want to look at PhysicsFS: https://icculus.org/physfs/

It's a C library, and it looks like someone generated a Nim wrapper for it: https://github.com/fowlmouth/physfs/blob/master/physfs.nim

2017-01-23 13:27:16