This content originally appeared on Stefan Judis Web Development and was authored by Stefan Judis
You might know that I'm running a Twitter bot called @randomMDN. Every few hours, the bot fetches the sitemap of MDN and tweets a random page.
It was running without a problem for two years, but recently it broke. The reason was that MDN changed the sitemap from https://developer.mozilla.org/sitemaps/en-US/sitemap.xml to https://developer.mozilla.org/sitemaps/en-US/sitemap.xml.gz. It's now a gzipped file.
It took me a while to figure out how to handle this new file format. For future reference, here's a snippet that shows the unzipping in Node.js.
The snippet uses got to fetch the gzipped sitemap over HTTP and node-gzip to unzip the response and transform it into a string.
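// note: require() works with got v11 and earlier; newer versions are ESM-only
const got = require('got');
const { ungzip } = require('node-gzip');

const SITEMAP_URL =
  'https://developer.mozilla.org/sitemaps/en-US/sitemap.xml.gz';

// wrapper (name is illustrative) because CommonJS has no top-level await
async function getSitemap() {
  // fetch file as a raw Buffer instead of a decoded string
  const { body } = await got(SITEMAP_URL, {
    responseType: 'buffer',
  });

  // unzip the buffered gzipped sitemap
  const sitemap = (await ungzip(body)).toString();
  return sitemap;
}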
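If you'd rather not add a dependency for the unzipping step, Node.js also ships a built-in zlib module that can decompress gzip data. Here's a minimal sketch of that alternative, not what the bot itself uses. The helper name gunzipToString and the sample XML string are made up for illustration, and the example gzips a string itself so it runs without a network request.

```javascript
const zlib = require('node:zlib');

// Hypothetical helper: decompress a gzipped Buffer and decode it as UTF-8.
function gunzipToString(gzippedBuffer) {
  return zlib.gunzipSync(gzippedBuffer).toString('utf8');
}

// Stand-in for a downloaded sitemap: gzip a small XML string locally.
const sample = '<urlset><url><loc>https://example.com/</loc></url></urlset>';
const gzipped = zlib.gzipSync(sample);

// Round trip: the unzipped result should match the original string.
const roundTripped = gunzipToString(gzipped);
console.log(roundTripped === sample);
```

In the snippet above, the body buffer returned by got could be passed to such a helper in place of the ungzip call. Note that gunzipSync blocks the event loop while it runs, so for large files the asynchronous zlib.gunzip is probably the better fit.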
Maybe that helps someone in the future. Have fun!
Stefan Judis | Sciencx (2021-02-08T23:00:00+00:00) How to download and unzip gz files in Node.js (#snippet). Retrieved from https://www.scien.cx/2021/02/08/how-to-download-and-unzip-gz-files-in-node-js-snippet/