System Design Basics: Distributed Systems


A distributed system is a system whose components are located on different networked servers and coordinate their actions by passing data to one another.

Key characteristics of a distributed system

These are characteristics that you might want a system to have. Depending on the design decisions made, some of them may be traded off in favor of others.

Scalability

Scalability describes how well a system can handle growth in demand. A system that can adapt to increases in data volume or transaction volume without degrading performance is considered scalable.

There are two ways in which we can scale a system:

Vertical Scaling: We can increase the capacity of an individual server (for example, more CPU, memory, or storage) so that it can process more work.

Figure: Vertical scaling of a server

Vertical scaling is limited by how far a single machine can be upgraded. Additionally, relying on one machine makes that machine a single point of failure: if the server goes down, everything is down.

Horizontal Scaling: We can add more servers to a network and distribute the load among them to handle more work.

Figure: Horizontal scaling of servers

Horizontal scaling is not capped by the limits of a single machine, so in principle it can keep growing: you add more servers to your network as needed. Additionally, because the load is distributed, it avoids the single point of failure that one vertically scaled server represents. If one server goes down, the other servers keep functioning.
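To make "distribute the load" concrete, here is a minimal sketch of one possible strategy, round-robin assignment. The strategy, the function name distribute, and the server names are assumptions for illustration; the article does not prescribe how load should be distributed.

```python
from itertools import cycle


def distribute(requests, servers):
    """Assign each request to a server in round-robin order."""
    if not servers:
        raise ValueError("at least one server is required")
    assignment = {server: [] for server in servers}
    rotation = cycle(servers)
    for request in requests:
        assignment[next(rotation)].append(request)
    return assignment


# Hypothetical server names and request IDs, for illustration only.
requests = list(range(6))
two_servers = distribute(requests, ["server-a", "server-b"])
three_servers = distribute(requests, ["server-a", "server-b", "server-c"])

for name, pool in (("2 servers", two_servers), ("3 servers", three_servers)):
    per_server = {server: len(assigned) for server, assigned in pool.items()}
    print(name, per_server)
```

With the same six requests, adding a third server lowers the number each server handles from three to two, which is the basic idea behind scaling out.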

Reliability

Reliability is the probability that a system performs its intended function without failure, under normal conditions, for a given period of time. A reliable system continues to deliver its services even when one or more of its components fail.

Reliability can be achieved through redundancy of both software and hardware components, so that if something goes down, a backup is ready to take its place. Redundancy is the duplication of critical components or functions of a system, usually in the form of a backup or fail-safe, with the intention of increasing reliability or improving performance.
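As a rough illustration of why redundancy helps, the sketch below computes the probability that at least one of several redundant copies is working. It assumes each copy fails independently and with the same probability, which real systems often violate (for example, copies sharing a power supply). The function name and the numbers are hypothetical.

```python
def redundant_reliability(component_reliability, copies):
    """Probability that at least one of `copies` independent components works.

    Assumes independent failures: the system fails only if every copy fails.
    """
    if not 0.0 <= component_reliability <= 1.0:
        raise ValueError("component_reliability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    probability_all_fail = (1.0 - component_reliability) ** copies
    return 1.0 - probability_all_fail


# A hypothetical component that works 90% of the time.
for copies in (1, 2, 3):
    print(copies, round(redundant_reliability(0.9, copies), 6))
```

Under these assumptions, each additional copy multiplies the chance of total failure by the failure probability of a single copy.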

Availability

Availability is the probability that a system is operational at a given point in time under normal conditions. It is closely tied to how resistant the system is to failures, which is often described as the system's fault tolerance.

If a system is up and operational for three quarters of a year, then it has 75% availability for that year.

A reliable system is also available, but an available system isn't necessarily reliable. To be reliable, a system should stay available under a variety of conditions. If a system is only available under certain conditions, it will go down when it's hit with unanticipated volume, which makes it unreliable.

An example is a website selling fan merchandise that, under normal conditions, is always available for customers to visit and purchase items. If the website releases a very popular item and receives far more traffic than it was set up to handle, it will go down and become unavailable to its customers.

Availability is usually expressed as a number of nines, which refers to the percentage of uptime of a service. For example, 4 nines means an uptime of 99.99%, and 5 nines means an uptime of 99.999%. A system with 5 nines or more is generally considered highly available.
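To get a feel for what the nines mean in practice, this small sketch converts a number of nines into an uptime percentage and a yearly downtime budget. It assumes a 365-day year; the function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def uptime_percent(nines):
    """Uptime percentage for a given number of nines, e.g. 4 -> 99.99."""
    return 100 * (1 - 10 ** (-nines))


def allowed_downtime_minutes(nines):
    """Minutes per year a service may be down and still meet its nines."""
    return MINUTES_PER_YEAR * 10 ** (-nines)


for nines in (2, 3, 4, 5):
    print(
        f"{nines} nines: {uptime_percent(nines):.3f}% uptime, "
        f"{allowed_downtime_minutes(nines):.2f} minutes of downtime per year"
    )
```

Four nines leaves a little under an hour of downtime per year, while five nines leaves only a few minutes.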

Latency and Throughput

Latency and throughput are two common measures of a system's performance.

Latency: How long it takes for data to traverse a system, that is, to get from point A in the system to point B.

Throughput: The number of operations a system can handle per unit of time. Throughput is often measured in transactions per second (TPS) or requests per second (RPS).
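The two measures can be computed from the same data. The sketch below assumes we have recorded a start and end timestamp (in seconds) for each request; the summarize function and the sample timestamps are hypothetical.

```python
def summarize(requests):
    """Compute mean latency and throughput from (start, end) timestamps.

    `requests` is a non-empty list of (start_seconds, end_seconds) tuples.
    """
    latencies = [end - start for start, end in requests]
    first_start = min(start for start, _ in requests)
    last_end = max(end for _, end in requests)
    window = last_end - first_start
    return {
        "mean_latency_s": sum(latencies) / len(latencies),
        "throughput_rps": len(requests) / window,
    }


# Hypothetical trace: four requests, handled two at a time.
trace = [(0.0, 0.5), (0.0, 0.5), (0.5, 1.0), (0.5, 1.0)]
stats = summarize(trace)
print(stats)
```

In this trace every request takes half a second, yet the system completes four requests per second because it handles two at once. Latency and throughput are related but measure different things.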

Maintainability

Maintainability is the probability that a system or system element can be repaired in a defined environment within a specified period of time. Increased maintainability implies shorter repair times. It encompasses how simple and fast it is to get back to full operation when a failure occurs. The less time it takes to repair a system, the more available that system is.
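One common way to express the link between repair time and availability is the steady-state formula availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR is the mean time to repair. These terms are not used in the article itself; they are introduced here as an assumption to illustrate the point, and the numbers are made up.

```python
def steady_state_availability(mtbf_hours, mttr_hours):
    """Availability = MTBF / (MTBF + MTTR).

    mtbf_hours: mean time between failures, in hours.
    mttr_hours: mean time to repair, in hours.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical system that fails about every 999 hours of operation.
slow_repair = steady_state_availability(999, 1.0)   # one hour to repair
fast_repair = steady_state_availability(999, 0.1)   # six minutes to repair
print(f"slow repair: {slow_repair:.5f}, fast repair: {fast_repair:.5f}")
```

With the failure rate held constant, cutting the repair time raises availability, which matches the claim that a more maintainable system is a more available one.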


System Design Basics: Distributed Systems was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.


This content originally appeared on Level Up Coding - Medium and was authored by Mariam Jaludi



Mariam Jaludi | Sciencx (2022-04-06T12:50:57+00:00) System Design Basics: Distributed Systems. Retrieved from https://www.scien.cx/2022/04/06/system-design-basics-distributed-systems/
