Streaming data from AWS S3 using NodeJS Stream API and Typescript

This content originally appeared on DEV Community and was authored by Austin Burger

The AWS S3 SDK and NodeJS read/write streams make it easy to download files from an AWS bucket. However, what if you wanted to stream the files instead?

The AWS SDK puts a timeout of 120000 ms (2 minutes) on each request to S3. Unless you are working with very small files, this just won't cut it.

One option is to simply raise that timeout, but then how much should you raise it? Since the timeout covers the total time a connection can last, you would either have to set it to some ridiculous value or guess how long the file will take to stream and update the timeout accordingly. That still doesn't account for the stream closing for HTTP(S)'s own timeout reasons.
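For reference, here is roughly what that first option looks like with the v2 JavaScript SDK. The httpOptions.timeout setting caps the total time a single request may take (it defaults to 120000 ms), so whatever number you pick is still a guess:

import {S3} from 'aws-sdk';

// The "just raise the timeout" approach, shown only for comparison.
// httpOptions.timeout is the total number of milliseconds a request may take,
// so you are left guessing a value big enough for your largest file.
const s3 = new S3({
    httpOptions: {
        timeout: 10 * 60 * 1000 // 10 minutes, an arbitrary guess
    }
});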

Instead of making guesses and fighting random bugs, we can make use of the NodeJS Stream API and create our very own custom readable "smart stream".

The idea is to create a stream that uses the power of AWS S3's ability to grab a range of data, close the connection, then grab another range. This stream will pause when its buffer is full, only requesting new data on an as-needed basis.
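To make that concrete, here is a minimal sketch of a single ranged getObject call (the bucket and key are hypothetical). The Range parameter uses standard HTTP byte-range syntax, and it is the same call the class below will issue over and over:

import {S3} from 'aws-sdk';

// A one-off ranged request: S3 honors the HTTP Range header,
// so a single getObject call can fetch just a slice of the object.
const s3 = new S3();

s3.getObject(
    {
        Bucket: 'my-example-bucket', // hypothetical bucket and key, for illustration only
        Key: 'my-example-file.csv',
        Range: 'bytes=0-65535' // grab only the first 64 KiB
    },
    (error, data) => {
        if (error) throw error;
        console.log(`Received ${(data.Body as Buffer).length} bytes`);
    }
);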

Before we begin, I am assuming you have used the AWS S3 SDK to download files successfully and are now wanting to convert that functionality into a proper stream. As such, I will omit the AWS implementation and instead show a simple example of how, and where, to instantiate this "smart stream" class.

We will start by creating the "smart stream" class:

import {Readable, ReadableOptions} from 'stream';
import type {S3} from 'aws-sdk';

export class SmartStream extends Readable {
    _currentCursorPosition = 0; // Holds the current starting position for our range queries
    _s3DataRange = 64 * 1024; // Number of bytes to grab per range request (64 KiB)
    _maxContentLength: number; // Total number of bytes in the file
    _s3: S3; // AWS.S3 instance
    _s3StreamParams: S3.GetObjectRequest; // Parameters passed into s3.getObject method

    constructor(
        parameters: S3.GetObjectRequest,
        s3: S3,
        maxLength: number,
        nodeReadableStreamOptions?: ReadableOptions
    ) {
        super(nodeReadableStreamOptions);
        this._maxContentLength = maxLength;
        this._s3 = s3;
        this._s3StreamParams = parameters;
    }

    _read() {
        if (this._currentCursorPosition > this._maxContentLength) {
            // If the current position is greater than the amount of bytes in the file
            // We push null into the buffer, NodeJS ReadableStream will see this as the end of file (EOF) and emit the 'end' event
            this.push(null);
        } else {
            // Calculate the end of the range of bytes we want to grab
            const rangeEnd = this._currentCursorPosition + this._s3DataRange;
            // If that end position is past the total number of bytes in the file
            // We adjust it so the last request only grabs the remaining bytes of data
            const adjustedRange = rangeEnd < this._maxContentLength ? rangeEnd : this._maxContentLength;
            // Set the Range property on our s3 stream parameters
            this._s3StreamParams.Range = `bytes=${this._currentCursorPosition}-${adjustedRange}`;
            // Update the current range beginning for the next go 
            this._currentCursorPosition = adjustedRange + 1;
            // Grab the range of bytes from the file
            this._s3.getObject(this._s3StreamParams, (error, data) => {
                if (error) {
                    // If we encounter an error grabbing the bytes
                    // We destroy the stream, NodeJS ReadableStream will emit the 'error' event
                    this.destroy(error);
                } else {
                    // We push the data into the stream buffer
                    this.push(data.Body);
                }
            });
        }
    }
}

Now that we have the SmartStream class coded, we are ready to wire it into our program. You can even pipe this stream into a 'gzip' stream for zipped files!
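As a sketch of that idea (assuming the object stored in S3 is gzipped, and using Node's built-in stream, zlib, and fs modules), you could run the smart stream through a gunzip transform and straight into a file on disk:

import {pipeline} from 'stream';
import {createGunzip} from 'zlib';
import {createWriteStream} from 'fs';
import {SmartStream} from './SmartStream'; // adjust the path to wherever you saved the class

// Given a SmartStream pointed at a gzipped object, decompress it on the fly
// and write the result to disk. pipeline() forwards errors from every stage
// and tears all three streams down if any one of them fails.
export function unzipToDisk(smartStream: SmartStream, outputPath: string) {
    pipeline(
        smartStream,
        createGunzip(),                // decompress chunk by chunk
        createWriteStream(outputPath), // the whole file is never held in memory
        (error) => {
            if (error) {
                console.error('Stream failed:', error);
            } else {
                console.log('Done!');
            }
        }
    );
}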

For this next part, since I am assuming you understand the AWS S3 SDK, I am simply going to offer an example of how to establish the stream.

import {SmartStream} from '<Path to SmartStream file>';

export async function createAWSStream(): Promise<SmartStream> {
    return new Promise((resolve, reject) => {
        const bucketParams = {
            Bucket: <Your Bucket>,
            Key: <Your Key>
        }

        try {
            // resolveS3Instance() is assumed to be your own helper that returns a configured AWS.S3 client
            const s3 = resolveS3Instance();

            s3.headObject(bucketParams, (error, data) => {
                if (error) {
                    // A throw inside this asynchronous callback would not be caught by the
                    // surrounding try/catch, so reject the promise directly instead
                    reject(error);
                    return;
                }
                // After getting the data we want from the call to s3.headObject
                // We have everything we need to instantiate our SmartStream class
                const stream = new SmartStream(bucketParams, s3, data.ContentLength);

                resolve(stream);
            });
        } catch (error) {
            reject(error);
        }
    });
}
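With that in place, consuming the stream is just a matter of piping it wherever you need the data. Here is a minimal usage sketch (the output path and import path are hypothetical) that saves the object to a local file:

import {createWriteStream} from 'fs';
import {createAWSStream} from './createAWSStream'; // adjust the path to wherever you saved the function above

// Stream the S3 object straight to a local file. The SmartStream only requests
// the next byte range once its internal buffer has room, so memory usage stays
// flat no matter how large the object is.
async function saveToDisk(outputPath: string) {
    const stream = await createAWSStream();

    stream
        .pipe(createWriteStream(outputPath))
        .on('finish', () => console.log(`Saved to ${outputPath}`))
        .on('error', (error) => console.error(error));
}

saveToDisk('./downloaded-file.csv').catch(console.error);

You could just as easily pipe the same stream into an HTTP response to proxy large downloads to clients without buffering them on your server.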

This is only one example of the amazing things you can do with the NodeJS standard Stream API. For further reading, check out the NodeJS Stream API docs!

