This content originally appeared on DEV Community and was authored by DEV Community
How do you back up data?
- Identify and back up all data that needs to be backed up, or reproduce the data from sources
- Secure and encrypt backups
- Perform data backup automatically
- Perform periodic recovery of the data to verify backup integrity and processes
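The last practice in this list, recovering data periodically to verify backup integrity, can be sketched in a few lines. This is a minimal local-file illustration, not an AWS API: the helper names `sha256_of`, `back_up`, and `verify_restore` are hypothetical, and a real workload would apply the same idea to its own backup service and storage.

```python
from __future__ import annotations

import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, backup_dir: Path) -> tuple[Path, str]:
    """Copy source into backup_dir and return the copy plus the original's checksum."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)
    return target, sha256_of(source)


def verify_restore(backup: Path, expected_digest: str) -> bool:
    """Restore the backup to a scratch directory and compare checksums."""
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / backup.name
        shutil.copy2(backup, restored)
        return sha256_of(restored) == expected_digest
```

Running `verify_restore` on a schedule, rather than only after an incident, is what turns a backup from an assumption into a tested recovery path.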
How do you use fault isolation to protect your workload?
- Deploy the workload to multiple locations
- Automate recovery for components constrained to a single location
- Use bulkhead architectures to limit scope of impact
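As a rough in-process analogy for the bulkhead idea, the sketch below gives each dependency its own capped pool of concurrent calls, so one saturated dependency cannot consume capacity that others need. The `Bulkhead` class and its parameters are hypothetical names chosen for illustration. At the architecture level the same principle is usually applied with separate cells or partitions rather than semaphores.

```python
import threading


class Bulkhead:
    """Caps concurrent calls to one dependency to limit the scope of impact."""

    def __init__(self, name: str, max_concurrent: int) -> None:
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, *args, **kwargs):
        # Reject immediately instead of queueing, so a slow dependency
        # fails fast rather than tying up callers.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError(f"bulkhead '{self.name}' is full")
        try:
            return func(*args, **kwargs)
        finally:
            # Always free the slot, even if the wrapped call raised.
            self._slots.release()
```

With one `Bulkhead` per dependency, a full compartment raises an error for its own callers only, while calls routed through other compartments continue to succeed.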
How do you design your workload to withstand component failures?
- Monitor all components of the workload to detect failures
- Fail over to healthy resources
- Automate healing on all layers
- Use static stability to prevent bimodal behavior
- Send notifications when events impact availability
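Three of the practices above (monitoring to detect failures, failing over to healthy resources, and sending notifications) can be combined in one small routine. The sketch below is illustrative only: `pick_healthy`, `is_healthy`, and `notify` are hypothetical names, and in practice the health check and the notification would be supplied by the workload's own monitoring and alerting services.

```python
from typing import Callable, Iterable


def pick_healthy(
    endpoints: Iterable[str],
    is_healthy: Callable[[str], bool],
    notify: Callable[[str], None],
) -> str:
    """Return the first endpoint that passes its health check.

    Every endpoint that fails the check triggers a notification, so
    operators learn about the failure even when failover succeeds.
    """
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
        notify(f"{endpoint} failed its health check")
    raise RuntimeError("no healthy endpoint available")
```

Listing endpoints in priority order keeps the primary preferred whenever it is healthy, and the final exception makes a total outage explicit instead of silently returning a dead endpoint.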
How do you test reliability?
- Use playbooks to investigate failures
- Perform post-incident analysis
- Test functional requirements
- Test scaling and performance requirements
- Test resiliency using chaos engineering
- Conduct game days regularly
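Chaos engineering tests resiliency by injecting faults on purpose and checking that the workload still behaves correctly. The sketch below is a deliberately small, deterministic version of that idea: `FlakyDependency` and `call_with_retries` are hypothetical names, and the injected fault is a simulated `ConnectionError` rather than a real infrastructure failure.

```python
class FlakyDependency:
    """Wraps a callable and fails the first `failures` calls (the injected fault)."""

    def __init__(self, func, failures: int) -> None:
        self._func = func
        self._remaining = failures
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        if self._remaining > 0:
            self._remaining -= 1
            raise ConnectionError("injected fault")
        return self._func(*args, **kwargs)


def call_with_retries(func, attempts: int):
    """Call func up to `attempts` times, re-raising the last ConnectionError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except ConnectionError as error:
            last_error = error
    raise last_error
```

A resiliency test would wrap a dependency in `FlakyDependency`, run the workload's normal code path, and assert both that the call eventually succeeds and that it fails cleanly once the injected faults exceed the retry budget.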
How do you plan for disaster recovery (DR)?
- Define recovery objectives for downtime and data loss
- Use defined recovery strategies to meet the recovery objectives
- Test disaster recovery implementation to validate the implementation
- Manage configuration drift at the DR site or region
- Automate recovery
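Recovery objectives for downtime and data loss are commonly expressed as a recovery time objective (RTO) and a recovery point objective (RPO), and a DR test is useful only if its results are compared against them. The sketch below shows that comparison under simple assumptions: `RecoveryObjectives` and `meets_objectives` are hypothetical names, and the timestamps would come from the records of an actual DR exercise.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    rto: timedelta  # maximum acceptable downtime
    rpo: timedelta  # maximum acceptable window of data loss


def meets_objectives(
    objectives: RecoveryObjectives,
    outage_start: datetime,
    last_backup: datetime,
    service_restored: datetime,
) -> bool:
    """Check one DR test run against the defined recovery objectives."""
    downtime = service_restored - outage_start
    # Data written after the last backup and before the outage is lost.
    data_loss = outage_start - last_backup
    return downtime <= objectives.rto and data_loss <= objectives.rpo
```

For example, with an RTO of one hour and an RPO of 15 minutes, a test that restores service in 45 minutes from a backup taken 10 minutes before the outage meets both objectives, while the same restore from a 30-minute-old backup does not.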
DEV Community | Sciencx (2022-03-07T16:26:58+00:00) Appendix: Reliability (Failure Management) – AWS Well-Architected Framework Study Guide. Retrieved from https://www.scien.cx/2022/03/07/appendix-reliability-failure-management-aws-well-architected-framework-study-guide/