AWS SageMaker Best Practices Packt Book Review

This content originally appeared on DEV Community and was authored by Salah Elhossiny

Four weeks ago, the Packt publishing team invited me, as an AWS ML Hero, to review one of their books on AWS SageMaker best practices.

Book chapters:

1. SageMaker Overview

This chapter provides an overview of the whole ML pipeline: preparing, building, training, tuning, deploying, and monitoring. It also tours SageMaker's capabilities for data preparation, model building, and training and tuning.

It then maps the ML phases onto SageMaker, covering data preparation, model training, and operations.

2. Data Science Environments

This chapter explains how to create managed data science environments that are scalable and repeatable for your model-building activities, using infrastructure as code (IaC) or configuration as code (CaC).

It also explains why AWS CloudFormation matters and what it offers:

Consistency
Improved management
Reducing manual approvals
Reducing hand-offs between siloed teams
Providing centralized governance, by ensuring environments are provisioned across teams using only approved configurations

The chapter also covers AWS Service Catalog and its importance in this approach.
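To make the infrastructure-as-code idea concrete, here is a minimal sketch of my own (not taken from the book) that builds a CloudFormation template for a single SageMaker notebook instance as a Python dictionary and serializes it to JSON. The logical ID `TeamNotebook`, the instance type, and the role ARN are placeholder assumptions; a real environment would need more resources and a real IAM role.

```python
import json


def notebook_template(instance_type: str, role_arn: str) -> dict:
    """Build a minimal CloudFormation template for one SageMaker notebook instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative data science environment (hypothetical example)",
        "Resources": {
            # "TeamNotebook" is a made-up logical ID for this sketch.
            "TeamNotebook": {
                "Type": "AWS::SageMaker::NotebookInstance",
                "Properties": {
                    "InstanceType": instance_type,
                    "RoleArn": role_arn,
                },
            }
        },
    }


# Placeholder account ID and role name; replace with real values before deploying.
template = notebook_template(
    "ml.t3.medium",
    "arn:aws:iam::123456789012:role/ExampleSageMakerRole",
)
template_json = json.dumps(template, indent=2)
print(template_json)
```

Because every team generates its environment from the same function, the approved configuration lives in one place, which is the consistency and governance benefit listed above.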

The chapter also gives a brief overview of the book's machine learning (ML) use case: predicting the value of a particular air quality measurement (for example, pm25) given a location (weather station) and a date. This can be treated as a regression problem and solved with an XGBoost model.
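As a rough illustration of what "regression with boosted trees" means, the sketch below fits a toy gradient-boosted model made of one-split trees (stumps) in plain Python. It is a stand-in for the idea behind XGBoost, not XGBoost itself, and the monthly pm25 values are invented for the example rather than taken from the book's dataset.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        error = sum((r - left_value) ** 2 for r in left)
        error += sum((r - right_value) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def fit_boosted(xs, ys, rounds=30, learning_rate=0.3):
    """Fit stumps one after another, each on the residuals left by the previous ones."""
    base = mean(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + sum(
        learning_rate * (left if x <= threshold else right)
        for threshold, left, right in stumps
    )


def mse(model, xs, ys):
    return mean((predict(model, x) - y) ** 2 for x, y in zip(xs, ys))


# Invented data: month of year -> pm25 reading at one hypothetical station.
months = list(range(1, 13))
pm25 = [38.0, 35.0, 30.0, 24.0, 18.0, 14.0, 12.0, 13.0, 17.0, 23.0, 31.0, 37.0]

model = fit_boosted(months, pm25)
baseline = (mean(pm25), [], 0.3)  # predicts the overall mean for every month
print(mse(baseline, months, pm25), mse(model, months, pm25))
```

The real use case has more features (station and full date) and a held-out test set; this sketch only shows how successive trees shrink the error that a constant prediction leaves behind.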

3. Data Labeling with Amazon SageMaker Ground Truth

This chapter reviews data labeling with SageMaker Ground Truth and its challenges, such as:

Challenges with labeling data at scale
Addressing unique labeling requirements with custom labeling workflows
Using active learning to reduce labeling time
Security and permissions (private workforces)
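The active learning item is the easiest one to picture in code. The following is a conceptual sketch of my own, not Ground Truth's implementation: items the model is confident about are labeled automatically, and the rest go to human labelers, least confident first. The 0.9 threshold and the item IDs are assumptions for illustration.

```python
def split_by_confidence(predictions, threshold=0.9):
    """Partition items into auto-labeled and human-review sets.

    predictions maps item id -> {label: probability}.
    """
    auto_labeled = {}
    needs_human = []
    for item_id, probs in predictions.items():
        label, confidence = max(probs.items(), key=lambda pair: pair[1])
        if confidence >= threshold:
            auto_labeled[item_id] = label
        else:
            needs_human.append((confidence, item_id))
    # Send the least confident items to human labelers first.
    needs_human.sort()
    return auto_labeled, [item_id for _, item_id in needs_human]


# Hypothetical model outputs for four unlabeled images.
predictions = {
    "img-001": {"cat": 0.97, "dog": 0.03},
    "img-002": {"cat": 0.55, "dog": 0.45},
    "img-003": {"cat": 0.10, "dog": 0.90},
    "img-004": {"cat": 0.70, "dog": 0.30},
}
auto_labeled, review_queue = split_by_confidence(predictions)
print(auto_labeled)
print(review_queue)
```

Each batch of human labels can then be used to retrain the model, so fewer items fall below the threshold in the next round.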

4. Data Preparation at Scale Using Amazon SageMaker Data Wrangler and Processing

In this chapter, the following topics are covered:

Visual data preparation with Data Wrangler
Bias detection and explainability with Data Wrangler
Data preparation at scale with SageMaker Processing
Differences between Data Wrangler and Spark (on EMR) depending on dataset size
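For the bias detection topic, a small worked example helps. The sketch below computes two pre-training bias metrics of the kind such reports include: class imbalance, CI = (n_a - n_d) / (n_a + n_d), and the difference in proportions of positive labels, DPL = q_a - q_d. The group names and data are invented, and this is plain Python rather than Data Wrangler's own code.

```python
def class_imbalance(facets, advantaged, disadvantaged):
    """CI = (n_a - n_d) / (n_a + n_d); 0 means the two groups are the same size."""
    n_a = sum(1 for f in facets if f == advantaged)
    n_d = sum(1 for f in facets if f == disadvantaged)
    return (n_a - n_d) / (n_a + n_d)


def label_proportion_difference(facets, labels, advantaged, disadvantaged):
    """DPL = q_a - q_d, where q is the share of positive labels in a group."""

    def positive_rate(group):
        group_labels = [l for f, l in zip(facets, labels) if f == group]
        return sum(group_labels) / len(group_labels)

    return positive_rate(advantaged) - positive_rate(disadvantaged)


# Invented dataset: six rows from group_a, two from group_d; label 1 is positive.
facets = ["group_a"] * 6 + ["group_d"] * 2
labels = [1, 1, 1, 0, 1, 0, 1, 0]

ci = class_imbalance(facets, "group_a", "group_d")
dpl = label_proportion_difference(facets, labels, "group_a", "group_d")
print(ci, dpl)
```

Here CI is 0.5, meaning group_a is heavily over-represented, while DPL is about 0.17, meaning group_a also receives positive labels somewhat more often.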

My personal feedback on this chapter is that it needs more clarification and examples.

5. Centralized Feature Repository with Amazon SageMaker Feature Store

This chapter covers the following main topics:

Basic concepts of SageMaker Feature Store.
Creating reusable features to reduce feature inconsistencies and inference latency.
Designing solutions for near real-time ML predictions.
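The basic concepts (a feature group, a record identifier, an event time, and separate online and offline stores) can be sketched with a toy in-memory class. This is my own conceptual model, not the SageMaker Feature Store API; `ToyFeatureGroup`, the feature names, and the integer timestamps are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFeatureGroup:
    record_id_name: str
    event_time_name: str
    online: dict = field(default_factory=dict)   # latest record per identifier
    offline: list = field(default_factory=list)  # append-only history

    def put_record(self, record: dict) -> None:
        """Append to the offline history and keep only the newest record online."""
        self.offline.append(record)
        key = record[self.record_id_name]
        current = self.online.get(key)
        if current is None or record[self.event_time_name] >= current[self.event_time_name]:
            self.online[key] = record

    def get_record(self, record_id):
        """Low-latency lookup used at inference time."""
        return self.online.get(record_id)


stations = ToyFeatureGroup(record_id_name="station_id", event_time_name="event_time")
stations.put_record({"station_id": "S1", "event_time": 100, "pm25_7d_avg": 14.2})
stations.put_record({"station_id": "S1", "event_time": 200, "pm25_7d_avg": 15.8})
# A late-arriving record is kept in the history but does not overwrite newer data.
stations.put_record({"station_id": "S1", "event_time": 150, "pm25_7d_avg": 15.0})
latest = stations.get_record("S1")
print(latest)
```

Training jobs would read the full offline history, while a near real-time prediction service would call only `get_record`, so both use the same feature definitions.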

6. Training and Tuning at Scale

This chapter covers the following topics:

ML training at scale with SageMaker distributed libraries.
Difference between data parallelism and model parallelism.
Considerations to weigh before choosing between data and model parallelism.

Automated model tuning with SageMaker hyperparameter tuning.
Organizing and tracking training jobs with SageMaker Experiments.
Best practices to consider while configuring hyperparameter jobs on Amazon SageMaker.
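To illustrate the data parallelism item, here is a minimal sketch of my own that simulates it on one machine: the batch is split into equal shards, each "worker" computes a gradient on its shard, and the gradients are averaged (the step an all-reduce performs on a real cluster). The one-parameter model y = w * x and the data are assumptions chosen to keep the arithmetic visible. Model parallelism, by contrast, splits the model itself across devices and is not shown here.

```python
def gradient(w, batch):
    """Gradient of mean squared error for the one-parameter model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def data_parallel_gradient(w, batch, workers):
    """Average per-shard gradients, as an all-reduce would across devices."""
    if len(batch) % workers != 0:
        raise ValueError("this sketch assumes equally sized shards")
    shard_size = len(batch) // workers
    shards = [batch[i * shard_size:(i + 1) * shard_size] for i in range(workers)]
    # On a real cluster each of these calls would run on a separate device.
    local_gradients = [gradient(w, shard) for shard in shards]
    return sum(local_gradients) / workers


# Invented data following y = 3x.
batch = [(float(x), 3.0 * x) for x in range(1, 9)]
full = gradient(0.5, batch)
parallel = data_parallel_gradient(0.5, batch, workers=4)
print(full, parallel)
```

With equal shard sizes the averaged gradient matches the full-batch gradient, which is why data parallelism can speed up training without changing what is being optimized.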

7. Profile Training Jobs with Amazon SageMaker Debugger

This chapter covers the following main topics:

Amazon SageMaker Debugger essentials
Real-time monitoring of training jobs using built-in and custom rules
Gaining insight into the training infrastructure and training framework by:
- Analyzing and visualizing the system and framework metrics generated by the profiler.
- Analyzing the profiler report generated by SageMaker Debugger.
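The idea of a rule that watches a training job can be sketched in a few lines. The function below is my own simplified check in the spirit of a loss-not-decreasing rule; it is not the Debugger SDK, and the window size, improvement threshold, and loss values are assumptions.

```python
def loss_not_decreasing(losses, window=3, min_improvement=0.01):
    """Return the first step whose loss has not improved by at least
    min_improvement (relative) compared with `window` steps earlier, or None."""
    for step in range(window, len(losses)):
        previous = losses[step - window]
        if previous - losses[step] < min_improvement * abs(previous):
            return step
    return None


# Invented loss curves: one healthy run and one that plateaus.
healthy = [1.0, 0.8, 0.65, 0.5, 0.4, 0.3]
stalled = [1.0, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7]

print(loss_not_decreasing(healthy))
print(loss_not_decreasing(stalled))
```

A monitoring service would evaluate a rule like this while the job runs and could stop the job or send a notification at the step where it fires, instead of waiting for training to finish.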

8. Managing Models at Scale Using a Model Registry

A model registry is a central repository for metadata related to a certain model version. It contains details about how the model was created, how it performed, and where and how it was deployed.

Additional features, such as approval workflows and notifications, are frequently included in model registry services or solutions.

This chapter covers the following topics:

Using a model registry
Choosing a model registry solution
Managing models using the Amazon SageMaker model registry
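A toy in-memory registry shows the core idea of versioned metadata plus an approval workflow. This is a conceptual sketch of my own, not the SageMaker model registry API; the class names, group name, S3 URIs, and metric values are hypothetical, and the status strings simply mirror the kind of approval states such a registry tracks.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    version: int
    artifact_uri: str
    metrics: dict
    status: str = "PendingManualApproval"


@dataclass
class ToyModelRegistry:
    groups: dict = field(default_factory=dict)  # group name -> list of versions

    def register(self, group, artifact_uri, metrics):
        """Add a new version to a model group; versions start unapproved."""
        versions = self.groups.setdefault(group, [])
        entry = ModelVersion(len(versions) + 1, artifact_uri, metrics)
        versions.append(entry)
        return entry

    def approve(self, group, version):
        self.groups[group][version - 1].status = "Approved"

    def latest_approved(self, group):
        """Return the newest approved version, or None if nothing is approved."""
        approved = [v for v in self.groups.get(group, []) if v.status == "Approved"]
        return approved[-1] if approved else None


registry = ToyModelRegistry()
registry.register("air-quality", "s3://example-bucket/models/v1.tar.gz", {"rmse": 4.1})
registry.register("air-quality", "s3://example-bucket/models/v2.tar.gz", {"rmse": 3.6})
registry.approve("air-quality", 1)
deployable = registry.latest_approved("air-quality")
print(deployable)
```

Version 2 has the better metric but stays out of production until someone approves it, which is exactly the gate an approval workflow provides.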

9. Updating Production Models Using SageMaker Endpoint Production Variants

This chapter covers the following main topics:

Basic concepts of Amazon SageMaker Endpoint Production Variants
Deployment strategies for updating ML models with Amazon SageMaker Endpoint Production Variants
Selecting an appropriate deployment strategy
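Several deployment strategies rest on splitting traffic between variants in proportion to their weights. The simulation below is a sketch of that idea under my own assumptions (variant names, a 9:1 weighting, and a fixed random seed); it does not call SageMaker.

```python
import random


def pick_variant(variants, rng):
    """variants maps name -> weight; a name is chosen with probability weight / total."""
    names = list(variants)
    weights = [variants[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical canary setup: about 10% of requests go to the candidate model.
variants = {"current-model": 9.0, "candidate-model": 1.0}
rng = random.Random(0)  # fixed seed so the simulation is repeatable

counts = {name: 0 for name in variants}
for _ in range(10_000):
    counts[pick_variant(variants, rng)] += 1
print(counts)
```

Raising the candidate's weight step by step shifts more traffic to it, and setting the old variant's weight to zero completes the rollout.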

10. Optimizing Model Hosting and Inference Costs

This chapter covers the following topics:

Real-time inference versus batch inference
Deploying multiple models behind a single inference endpoint
Scaling inference endpoints to meet inference traffic demands
Using Elastic Inference for deep learning models
Optimizing models with SageMaker Neo

11. Monitoring Production Models with Amazon SageMaker Model Monitor and Clarify

This chapter covers the following main topics:

Basic concepts of Amazon SageMaker Model Monitor and Amazon SageMaker Clarify
End-to-end architectures for monitoring ML models
Best practices for monitoring ML models
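The core monitoring pattern is to compute baseline statistics from training data and compare live traffic against them. Here is a deliberately simple sketch of that pattern, assuming a single numeric feature and a mean-shift check measured in baseline standard deviations; the readings are invented, and real monitors use richer statistics and constraints.

```python
from statistics import mean, stdev


def baseline_stats(values):
    """Summarize the training data once, before deployment."""
    return {"mean": mean(values), "std": stdev(values)}


def drifted(baseline, live_values, max_sigma=3.0):
    """Flag drift when the live mean is more than max_sigma baseline
    standard deviations away from the baseline mean."""
    shift = abs(mean(live_values) - baseline["mean"])
    return shift > max_sigma * baseline["std"]


# Invented pm25 readings seen during training and in two production windows.
training = [12, 15, 14, 13, 16, 15, 14, 13]
baseline = baseline_stats(training)

print(drifted(baseline, [13, 15, 14, 16]))
print(drifted(baseline, [25, 27, 24, 26]))
```

In an end-to-end architecture this check would run on a schedule over captured endpoint requests and raise an alert or trigger retraining when it returns true.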

12. Machine Learning Automated Workflows

This chapter covers the following topics:

Considerations for automating your SageMaker ML workflows
Building ML workflows with Amazon SageMaker Pipelines
Creating CI/CD ML pipelines using Amazon SageMaker projects
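An ML workflow is essentially a graph of steps with dependencies. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to run steps in dependency order; it is a conceptual illustration of my own, not the SageMaker Pipelines SDK, and the step names and placeholder actions are assumptions.

```python
from graphlib import TopologicalSorter


def run_pipeline(dependencies, actions):
    """Run each step after all of the steps it depends on."""
    executed = []
    for name in TopologicalSorter(dependencies).static_order():
        actions[name]()
        executed.append(name)
    return executed


# Each step maps to the set of steps that must finish before it starts.
dependencies = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
    "register": {"evaluate"},
}

log = []
# Placeholder actions that only record their name; real steps would launch jobs.
actions = {name: (lambda n=name: log.append(n)) for name in dependencies}

order = run_pipeline(dependencies, actions)
print(order)
```

Declaring the dependencies rather than scripting the order makes it straightforward to add steps, such as a bias check before registration, without rewriting the rest of the workflow.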

13. Well-Architected Machine Learning with Amazon SageMaker

This chapter covers the following main topics:

Best practices for operationalizing ML workloads
Best practices for securing ML workloads
Best practices for building reliable ML workloads
Best practices for building performant ML workloads
Best practices for building cost-optimized ML workloads

14. Managing SageMaker Features across Accounts

This chapter discusses the following topics as they relate to managing SageMaker features across multiple AWS accounts:

Examining an overview of the AWS multi-account environment
Understanding the benefits of using multiple AWS accounts with Amazon SageMaker
Examining multi-account considerations with Amazon SageMaker

Conclusion

I really enjoyed reading this book, and I rate it 4.5/5 because it gives a very good overview of best practices for implementing ML on AWS SageMaker.

Happy reading :)



APA

Salah Elhossiny | Sciencx (2021-12-03T10:09:01+00:00) AWS Sagemaker Best Practices Packt Book Review. Retrieved from https://www.scien.cx/2021/12/03/aws-sagemaker-best-practices-packt-book-review/

