Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
539 views
in Technique[技术] by (71.8m points)

zip - Simplest way to download and unzip files in Node.js cross-platform?

Just looking for a simple solution to downloading and unzipping .zip or .tar.gz files in Node.js on any operating system.

Not sure if this is built in or I have to use a separate library. Any ideas? Looking for just a couple lines of code so when the next zip file comes that I want to download in node, it's a no brainer. Feel like this should be easy and/or built in, but I can't find anything. Thanks!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

It's 2017 (October 26th, to be exact).

For an ancient and pervasive technology such as unzip I would expect there to exist a fairly popular, mature node.js unzip library that is "stagnant" and "unmaintained" because it is "complete".

However, most libraries appear either to be completely terrible or to have commits recently as just a few months ago. This is quite concerning... so I've gone through several unzip libraries, read their docs, and tried their examples to try to figure out WTF. For example, I've tried these:

Update 2020: Haven't tried it yet, but there's also archiver

Top Recommendation: yauzl

Works great for completely downloaded file. Not as great for streaming.

Well documented. Works well. Makes sense.

2nd Pick: node-stream-zip

antelle's node-stream-zip seems to be the best

Install:

npm install --save node-stream-zip

Usage:

'use strict';

var fs = require('fs');
var StreamZip = require('node-stream-zip');

var zip = new StreamZip({
  file: './example.zip'
, storeEntries: true
});

zip.on('error', function (err) { console.error('[ERROR]', err); });

zip.on('ready', function () {
  console.log('All entries read: ' + zip.entriesCount);
  //console.log(zip.entries());
});

zip.on('entry', function (entry) {
  var pathname = path.resolve('./temp', entry.name);
  if (/../.test(path.relative('./temp', pathname))) {
      console.warn("[zip warn]: ignoring maliciously crafted paths in zip file:", entry.name);
      return;
  }

  if ('/' === entry.name[entry.name.length - 1]) {
    console.log('[DIR]', entry.name);
    return;
  }

  console.log('[FILE]', entry.name);
  zip.stream(entry.name, function (err, stream) {
    if (err) { console.error('Error:', err.toString()); return; }

    stream.on('error', function (err) { console.log('[ERROR]', err); return; });

    // example: print contents to screen
    //stream.pipe(process.stdout);

    // example: save contents to file
    fs.mkdir(
      path.dirname(pathname),
      { recursive: true },
      function (err) {
        stream.pipe(fs.createWriteStream(pathname));
      }
    );
  });
});

Security Warning:

Not sure if this checks entry.name for maliciously crafted paths that would resolve incorrectly (such as ../../../foo or /etc/passwd).

You can easily check this yourself by comparing /../.test(path.relative('./to/dir', path.resolve('./to/dir', entry.name))).

Pros: (Why do I think it's the best?)

  • can unzip normal files (maybe not some crazy ones with weird extensions)
  • can stream
  • seems to not have to load the whole zip to read entries
  • has examples in normal JavaScript (not compiled)
  • doesn't include the kitchen sink (i.e. url loading, S3, or db layers)
  • uses some existing code from a popular library
  • doesn't have too much senseless hipster or ninja-foo in the code

Cons:

  • Swallows errors like a hungry hippo
  • Throws strings instead of errors (no stack traces)
  • zip.extract() doesn't seem to work (hence I used zip.stream() in my example)

Runner up: node-unzipper

Install:

npm install --save unzipper

Usage:

'use strict';

var fs = require('fs');
var unzipper = require('unzipper');

fs.createReadStream('./example.zip')
  .pipe(unzipper.Parse())
  .on('entry', function (entry) {
    var fileName = entry.path;
    var type = entry.type; // 'Directory' or 'File'

    console.log();
    if (//$/.test(fileName)) {
      console.log('[DIR]', fileName, type);
      return;
    }

    console.log('[FILE]', fileName, type);

    // TODO: probably also needs the security check

    entry.pipe(process.stdout/*fs.createWriteStream('output/path')*/);
    // NOTE: To ignore use entry.autodrain() instead of entry.pipe()
  });

Pros:

  • Seems to work in a similar manner to node-stream-zip, but less control
  • A more functional fork of unzip
  • Seems to run in serial rather than in parallel

Cons:

  • Kitchen sink much? Just includes a ton of stuff that's not related to unzipping
  • Reads the whole file (by chunk, which is fine), not just random seeks

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

56.7k users

...