Streamline Your Machine Learning Pipeline: Implement a Batch Predictions API with FastAPI and Excel

Discover how to easily and efficiently make predictions on large datasets using the power of Python’s FastAPI

Photo by Mika Baumeister on Unsplash

Introduction

Batch predictions are a powerful tool in Machine Learning projects, allowing you to make predictions on large amounts of data at once. This can be far more efficient than scoring records one by one and saves a lot of time, especially when working with large datasets. In this article, we will discuss the importance of batch predictions, walk through how to implement a batch prediction endpoint using Python’s FastAPI, and demonstrate it with a simple endpoint that accepts an Excel file and returns an Excel file with predictions added.

Importance of batch predictions

Batch predictions have many benefits in Machine Learning projects, such as saving time and increasing efficiency. Instead of making individual predictions one at a time, a batch prediction lets you score all of the data, potentially thousands of rows, in a single call. This is especially useful when working with large datasets. In industry, batch predictions are used in many applications, such as natural language processing and image classification.
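
As a minimal sketch of the difference (using a toy linear model with hypothetical weights, purely for illustration), compare scoring records one at a time with scoring the whole dataset in a single vectorized call:

import numpy as np
import pandas as pd

# Toy 'model': a linear score thresholded at 3 (hypothetical, for illustration only)
weights = np.array([1.0, 1.0, 1.0])

df = pd.DataFrame({
    'feature_1': [1, 2, 0],
    'feature_2': [1, 2, 1],
    'feature_3': [0, 1, 1],
})

# Row-by-row: one model call per record, which gets slow on large datasets
row_predictions = [int(row @ weights > 3) for row in df.to_numpy()]

# Batch: one vectorized call over the entire dataset at once
batch_predictions = (df.to_numpy() @ weights > 3).astype(int)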

TLDR; Show Me the Code

This article covers the importance of batch predictions in Machine Learning projects and shows how to implement a batch prediction endpoint using Python’s FastAPI. It also surveys other frameworks that offer batch prediction. The example endpoint accepts an Excel file and returns an Excel file with predictions added.

You can find the complete code for this example in this GitHub repository: https://github.com/shanaka-desoysa/fastapi-excel-batch-processor

By implementing a batch prediction endpoint, you can make predictions on large amounts of data at once, saving time and increasing efficiency. With Python’s FastAPI, it takes only a small amount of code to create a batch prediction endpoint that handles file uploads and returns results in bulk.

Batch Prediction APIs in ML

Many Machine Learning projects and frameworks implement batch prediction endpoints, since batch prediction is a common requirement in ML applications. Some examples include:

  1. TensorFlow Serving: an open-source framework for serving Machine Learning models that lets you deploy trained models and run predictions in batch mode.
  2. MLflow: an open-source platform for managing the ML lifecycle, including experiment tracking across multiple users, packaging ML code in a reusable and reproducible form, and deploying models.
  3. AWS SageMaker: a fully managed service that enables developers and data scientists to build, train, and deploy Machine Learning models quickly, and also offers batch transform jobs for offline predictions.
  4. Microsoft Azure Machine Learning: a cloud-based platform for building, deploying, and managing Machine Learning models that also supports batch predictions through its REST API.
  5. Hugging Face’s Transformers: an open-source library that provides pre-trained models and supports batch predictions on input data (a minimal sketch follows this list).
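
As a concrete illustration of the last item, the Transformers pipeline API accepts a list of inputs and can batch them internally. This is only a minimal sketch; it assumes the transformers package and a backend such as PyTorch are installed, and the default sentiment-analysis model is downloaded on first use:

from transformers import pipeline

# Load a default sentiment-analysis pipeline (downloads a pretrained model on first run)
classifier = pipeline('sentiment-analysis')

texts = [
    'Batch predictions save a lot of time.',
    'Scoring rows one by one is painfully slow.',
]

# Pass a list of inputs; batch_size controls how many are scored per forward pass
results = classifier(texts, batch_size=2)
print(results)  # e.g. [{'label': 'POSITIVE', 'score': ...}, ...]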

Implementing a batch prediction endpoint with FastAPI

In this section, we show how to use Python’s FastAPI library to easily add a batch prediction endpoint to your own project. The endpoint accepts an Excel file as input and returns an Excel file with predictions added. The code below defines a simple endpoint that accepts an Excel upload, reads it into a Pandas DataFrame, and applies a simple model to make predictions. The modified DataFrame is then written to a temporary file and returned as an Excel file response.

import io
import tempfile
import logging
from fastapi import FastAPI, File, HTTPException
from fastapi.responses import FileResponse
from fastapi.staticfiles import StaticFiles
import pandas as pd

logger = logging.getLogger(__name__)


app = FastAPI()
app.mount("/static", StaticFiles(directory="static"), name="static")


def predict(feature_1, feature_2, feature_3):
    # Simple model: predict 1 if feature_1 + feature_2 + feature_3 > 3, else predict 0
    return 1 if feature_1 + feature_2 + feature_3 > 3 else 0


@app.post("/predict_batch/")
async def predict_batch(file: bytes = File(...)):
    """
    Expects an Excel file with columns: feature_1, feature_2, feature_3.
    <a href="/static/example_input.xlsx">Example Input</a>.<br>
    Returns a modified Excel file with columns: feature_1, feature_2, feature_3, prediction.
    <a href="/static/example_output.xlsx">Example Output</a>
    """
    try:
        # Read the uploaded Excel file into a Pandas DataFrame
        df = pd.read_excel(io.BytesIO(file))

        # Make predictions by applying the predict function to every row
        df['prediction'] = df[['feature_1', 'feature_2', 'feature_3']].apply(
            lambda row: predict(row['feature_1'], row['feature_2'], row['feature_3']),
            axis=1,
        )

        # Write the modified DataFrame to an in-memory buffer, copy it to a
        # temporary file on disk, and return that file as an Excel response
        stream = io.BytesIO()
        df.to_excel(stream, index=False)
        stream.seek(0)
        with tempfile.NamedTemporaryFile(mode="w+b", suffix=".xlsx", delete=False) as FOUT:
            FOUT.write(stream.read())
        return FileResponse(
            FOUT.name,
            media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
            headers={
                "Content-Disposition": "attachment; filename=predictions.xlsx",
                "Access-Control-Expose-Headers": "Content-Disposition",
            },
        )
    except Exception as e:
        logger.error(e)
        raise HTTPException(status_code=400, detail="Batch prediction failed.")
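
To run the server locally, a reasonable setup (the file name and exact dependencies are assumptions, not spelled out in the article) is to save the code above as main.py, install fastapi, uvicorn, pandas, and openpyxl, and start it with uvicorn main:app --reload. Because the app mounts a StaticFiles directory named static, that folder must exist before startup. The sketch below is a hypothetical helper that creates it and generates the two example spreadsheets referenced in the docstring, using the same toy prediction rule:

import os
import pandas as pd

# Hypothetical setup script: create the static/ directory expected by the
# StaticFiles mount and write the example input/output spreadsheets.
# Assumes openpyxl is installed so pandas can write .xlsx files.
os.makedirs('static', exist_ok=True)

example = pd.DataFrame({
    'feature_1': [1, 2, 0],
    'feature_2': [1, 2, 1],
    'feature_3': [0, 1, 1],
})
example.to_excel('static/example_input.xlsx', index=False)

# Apply the same toy rule as the endpoint: 1 if the feature sum exceeds 3, else 0
example['prediction'] = (example[['feature_1', 'feature_2', 'feature_3']].sum(axis=1) > 3).astype(int)
example.to_excel('static/example_output.xlsx', index=False)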

The endpoint expects an Excel file with the columns feature_1, feature_2, and feature_3; an example input file is served at /static/example_input.xlsx. It returns a modified Excel file with the columns feature_1, feature_2, feature_3, and prediction; an example output file is served at /static/example_output.xlsx.
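
With the server running, the endpoint can be exercised from any HTTP client, including the interactive docs at /docs. The following is a minimal client sketch using the requests library (the host and port are assumptions for a default local uvicorn run); the multipart field must be named file to match the endpoint parameter:

import requests

# Upload the Excel file as multipart form data and save the returned spreadsheet.
# Assumes the API is running locally on the default uvicorn port.
with open('static/example_input.xlsx', 'rb') as f:
    response = requests.post('http://localhost:8000/predict_batch/', files={'file': f})

response.raise_for_status()

with open('predictions.xlsx', 'wb') as out:
    out.write(response.content)  # Excel file with the added prediction column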

Example Input File
Example Output File

Conclusion

In conclusion, batch predictions are an essential tool in Machine Learning projects that can save time and improve efficiency when working with large datasets. We demonstrated how to implement a batch prediction endpoint using Python’s FastAPI and showed how it can be used to make predictions on an Excel file. We also looked at other frameworks that provide batch prediction capabilities, such as TensorFlow Serving, MLflow, and SageMaker. By implementing a batch prediction endpoint, you can streamline your Machine Learning pipeline and make predictions on large amounts of data with ease.


Streamline Your Machine Learning Pipeline: Implement a Batch Predictions API with FastAPI and Excel was originally published in Level Up Coding on Medium and was authored by Shanaka DeSoysa.
