This content originally appeared on DEV Community and was authored by Koen Barmentlo
Web applications are often divided into multiple deployment units. These are often called microservices, although most of the time they are not really microservices. A deployment unit is a self-contained package of software components that can be deployed individually, for example a single web application with its database. An architecture that is divided into multiple deployment units is called a distributed architecture. In web applications, deployment units communicate through mechanisms such as REST, SOAP or events. An architecture that consists of a single deployment unit is called a monolithic architecture.
Advantages of distributed architectures over monolithic architectures
- Deploy applications over several machines and you'll have much more computing power available.
- When designed well, applications won't go down completely if a part of the application goes down.
- Multiple teams can work more easily individually on a part of the application.
Sounds great, right? Well, there are a lot of challenges and disadvantages to face if we choose a distributed architecture. The next sections describe them.
The fallacies of distributed computing
The fallacies of distributed computing are a set of eight false assumptions programmers and architects often make. They were formulated by L. Peter Deutsch and others at Sun Microsystems in 1994. The following sections describe them and provide options to deal with the problems these fallacies might cause.
Fallacy 1: The network is reliable
If service 2 works perfectly well but service 1 cannot reach it due to network issues, service 2 is still unavailable from the point of view of service 1. This is why timeouts, circuit breakers and retry policies exist. A great tool for handling common network issues in .NET is Polly, but even with a tool like this, the network is still not completely reliable.
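To make the idea of a retry policy concrete, here is a minimal sketch in Python. It is not Polly and not a production-ready policy; `call_with_retry` and `flaky_service_2` are hypothetical names invented for this illustration, and the delays are arbitrary.

```python
import time


def call_with_retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run `operation`, retrying on network-style errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts:
                raise  # out of retries: let the caller decide what to do
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical flaky dependency: fails twice, then succeeds.
calls = {"count": 0}


def flaky_service_2():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("service 2 unreachable")
    return "ok"


# The sleep function is replaced here so the example runs instantly.
result = call_with_retry(flaky_service_2, sleep=lambda seconds: None)
```

Note that the last failure is re-raised. A retry policy only hides short interruptions, so the caller still needs a plan for when the other service stays unreachable.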
Fallacy 2: Latency is zero
If component 1 calls component 2 within a monolith, the latency is almost zero. If a network call has to be made, the call takes much longer to complete, especially if many network calls have to be made. If the latency of a single call is 100 ms and we chain ten calls one after another, latency alone adds one second to the business process. Ways to reduce the impact of latency are:
- Reduce the number of network calls. Instead of sending multiple pieces of data individually, send them in the same request.
- Latency could be reduced by moving the data closer to the client. If the client is in West Europe, make sure your data is in West Europe as well.
- Temporarily caching data can reduce the number of network calls, reducing the latency to almost zero if the data has already been fetched. Storing data locally at the client (with a pub/sub model, for example) is another option to bring the latency close to zero.
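The caching option from the list above could look roughly like the following Python sketch. `TtlCache` and `fetch_customer` are hypothetical names, and the 30 second lifetime is an arbitrary assumption. The clock is injectable only so that the example can simulate time passing.

```python
import time


class TtlCache:
    """Cache values for `ttl_seconds` so repeated reads skip the network call."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no network call
        value = self._fetch(key)  # miss or expired: call the remote service
        self._entries[key] = (now + self._ttl, value)
        return value


network_calls = []


def fetch_customer(customer_id):
    """Hypothetical stand-in for a call to another service."""
    network_calls.append(customer_id)
    return {"id": customer_id}


fake_now = [0.0]
cache = TtlCache(fetch_customer, ttl_seconds=30, clock=lambda: fake_now[0])
cache.get(1)        # first read goes over the network
cache.get(1)        # second read is served from the cache
fake_now[0] = 31.0  # simulate 31 seconds passing
cache.get(1)        # the entry has expired, so it is fetched again
```

The trade-off is staleness: for up to 30 seconds the caller may see data that has already changed in the other service.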
Fallacy 3: Bandwidth is infinite
Let's say component 1 fetches 500 kB of data from service 2. That doesn't sound like much, but if that happens 2,000 times, 1 GB of data has been transferred. This could cause increased latency and bottlenecks. Therefore, monitoring bandwidth is probably a good idea in a distributed architecture. Ways to reduce the amount of bandwidth used are:
- Caching
- Storing data locally
- Compression
- GraphQL
- Field selectors
- You could also use lightweight data formats like JSON or a binary serialization format.
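Two of the options above, field selectors and compression, can be sketched with the Python standard library. The `orders` data and the `select_fields` helper are made up for this illustration, and the savings depend entirely on the payload.

```python
import gzip
import json


def select_fields(record, fields):
    """Keep only the fields the caller asked for."""
    return {name: record[name] for name in fields if name in record}


# Made-up payload: 100 orders with a large field the caller does not need.
orders = [
    {"id": i, "customer": "ACME", "status": "shipped", "notes": "x" * 200}
    for i in range(100)
]

full = json.dumps(orders).encode("utf-8")
trimmed = json.dumps(
    [select_fields(order, ("id", "status")) for order in orders]
).encode("utf-8")
compressed = gzip.compress(trimmed)

print(len(full), len(trimmed), len(compressed))
```

Selecting fields removes data the receiver never uses, and compression then shrinks what remains at the cost of some CPU time on both sides.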
Fallacy 4: The network is secure
The attack surface of distributed applications is much bigger than the attack surface of monolithic applications. Every single component should be secured, because there are many ways your application can be attacked: XSS, vulnerabilities in operating systems and libraries, and DDoS attacks, just to name a few.
Fallacy 5: The topology never changes
This fallacy is about every network component, such as routers, servers, firewalls and proxy servers. The topology changes all the time. An updated network component could make services unavailable. A component that breaks will be replaced. A server that can't handle the load anymore could be replaced, or a load balancer and an extra server could be added. With modern technology like Kubernetes, Docker and Azure App Service, virtual machines or containers can even be added or removed dynamically. Ways to deal with a changing topology are:
- Use host names instead of hard-coded IP addresses.
- If that's not enough, use discovery services.
- Service bus frameworks could also help, because every component communicates with the service bus.
- Automate as much as possible so you can replace a server as quickly as possible.
- Monitor all services.
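To illustrate what a discovery service does, here is a toy in-memory registry in Python. Real discovery services are separate, highly available systems; `ServiceRegistry` and the host names below are hypothetical and only show the lookup idea.

```python
class ServiceRegistry:
    """Toy registry: instances register under a name, callers resolve by name."""

    def __init__(self):
        self._instances = {}  # service name -> list of addresses
        self._next = {}       # service name -> round-robin position

    def register(self, name, address):
        self._instances.setdefault(name, []).append(address)

    def deregister(self, name, address):
        self._instances[name].remove(address)

    def resolve(self, name):
        instances = self._instances.get(name, [])
        if not instances:
            raise LookupError(f"no instances registered for {name!r}")
        # Round-robin over whatever instances exist right now.
        index = self._next.get(name, 0) % len(instances)
        self._next[name] = index + 1
        return instances[index]


registry = ServiceRegistry()
registry.register("orders", "orders-1.internal:8080")
registry.register("orders", "orders-2.internal:8080")

first = registry.resolve("orders")
second = registry.resolve("orders")

# The topology changes: one instance is taken out of service.
registry.deregister("orders", "orders-1.internal:8080")
after_change = registry.resolve("orders")
```

Callers only know the name `orders`, so instances can come and go without any caller being reconfigured.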
Fallacy 6: There is only one admin
Distributed architectures are complex, especially when they get big. They can't be maintained by a single administrator, so a lot of communication between teams is required to make everything work correctly. This makes decoupling, release management and monitoring extra important.
Fallacy 7: Transport cost is zero
This fallacy is about money. With a distributed architecture you will need extra servers, extra proxies, firewalls and so on, which makes a distributed architecture more expensive. If we want to cache data, as discussed in fallacies 2 and 3, we might need extra server memory or a Redis cluster. If we use compression, as discussed in fallacy 3, we need more computing power to compress the data. Extra resources are also needed for serializing and deserializing data. These things might seem cheap, but at large scale they could become very expensive.
Fallacy 8: The network is homogeneous
Networks consist of different components from different vendors which have to be compatible with each other. In a distributed architecture a lot of different combinations of components can be used, and not all of them are fully tested. We also don't have control over which browsers and devices connect to our service. This fallacy isn't only about hardware. It's also about software. Ways to deal with this are:
- Try to use open and popular standards like JSON or XML.
- Using PaaS or IaaS providers will take some hardware challenges away.
Other challenges
Monitoring
Finding bugs in a distributed architecture is hard. A monolithic application has one log, while a distributed application has several. Combining all these logs is necessary to trace what happened when an error occurred. There are tools for this, but it's still much more difficult than working with a single log.
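One common way to make separate logs combinable is a correlation ID: every service writes the same request identifier into its log entries. The post does not name this technique, so take the following Python sketch as one possible approach. `log_event`, `trace` and the log contents are hypothetical.

```python
import uuid


def log_event(log, timestamp, service, correlation_id, message):
    """Append a structured entry to one service's own log."""
    log.append(
        {
            "timestamp": timestamp,
            "service": service,
            "correlation_id": correlation_id,
            "message": message,
        }
    )


def trace(correlation_id, *logs):
    """Merge several logs and return one request's entries in time order."""
    merged = [
        entry
        for log in logs
        for entry in log
        if entry["correlation_id"] == correlation_id
    ]
    return sorted(merged, key=lambda entry: entry["timestamp"])


web_log, orders_log = [], []
request_id = str(uuid.uuid4())  # created once, passed along with every call
other_id = str(uuid.uuid4())

log_event(web_log, 1, "web", request_id, "received POST /orders")
log_event(orders_log, 2, "orders", request_id, "order stored")
log_event(orders_log, 3, "orders", other_id, "unrelated request")
log_event(web_log, 4, "web", request_id, "responded 201")

messages = [entry["message"] for entry in trace(request_id, web_log, orders_log)]
```

This only works if every service passes the identifier on with each outgoing call and if the clocks of the services are reasonably in sync.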
Contract versioning
When multiple components talk to each other, they need to understand each other. A data contract is used for this purpose. A data contract describes the messages being sent from one component to another. It specifies the format being used (XML or JSON, for example) and the properties, data types and structure of the data. Contracts can't simply be changed, because a change might break another component. Therefore contracts need versioning, so that components can migrate to a new version of a contract at their own pace. Changes in contracts must also be communicated to other development teams, and there should be an overview of which deployment unit uses which contract, so you know when you can remove an old contract.
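A small Python sketch of what supporting two contract versions side by side could look like. The `version` field, the customer message shapes and `parse_customer` are assumptions made up for this example, not a standard.

```python
def parse_customer(message):
    """Accept both the old and the new customer contract."""
    version = message.get("version", 1)  # assume old senders omit the field
    if version == 1:
        # Version 1 carried the full name in a single property.
        first_name, _, last_name = message["name"].partition(" ")
        return {"first_name": first_name, "last_name": last_name}
    if version == 2:
        # Version 2 split the name into two properties.
        return {
            "first_name": message["first_name"],
            "last_name": message["last_name"],
        }
    raise ValueError(f"unsupported contract version: {version}")


old_style = parse_customer({"name": "Ada Lovelace"})
new_style = parse_customer(
    {"version": 2, "first_name": "Ada", "last_name": "Lovelace"}
)
```

Once the overview of contract usage shows that no sender uses version 1 anymore, the first branch can be removed.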
Deployment
Many components have to be deployed in a distributed architecture.
- Make sure components are loosely coupled so each can be deployed individually.
- Automate deployments to reduce the amount of work needed to deploy all components.
Distributed transactions
Transactions are easy in monolithic applications: begin the transaction, do stuff, then commit or roll back the transaction. But what if the stuff you want to do requires actions in multiple components? Technologies exist to handle these situations, but they are still much more complex than transactions that are not distributed. Distributed architectures often rely on eventual consistency.
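One pattern that is often used here, although the post does not name it, is the saga: each step has a compensating action that undoes it if a later step fails. The Python sketch below is a simplified, in-process illustration. `run_saga` and the order steps are hypothetical, and a real saga would also have to survive crashes and failing compensations.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order; undo completed steps on failure."""
    completed = []
    try:
        for action, compensation in steps:
            action()
            completed.append(compensation)
    except Exception:
        # There is no shared transaction to roll back, so undo step by step.
        for compensation in reversed(completed):
            compensation()
        return False
    return True


events = []


def charge_payment():
    """Hypothetical call to a payment service that fails."""
    raise RuntimeError("payment service declined")


succeeded = run_saga(
    [
        (
            lambda: events.append("reserve stock"),
            lambda: events.append("release stock"),
        ),
        (charge_payment, lambda: events.append("refund payment")),
    ]
)
```

Between the reservation and its compensation, other components can see the reserved stock. That intermediate state is what eventual consistency means in practice.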
Local development
Local development with a distributed architecture can be done in two different ways. The first is to set up all the components of the application (or only the subset a developer needs) on the developer's machine. The larger the application gets, the harder and more time-consuming this becomes. The other way is to set up an extra environment for development purposes. This environment must be maintained and comes with the cost of extra hardware. Infrastructure as code or scripts make it easier to set up new environments. Debugging is also much harder in a distributed environment.
What kind of application could be suitable for a distributed architecture?
- Applications with a huge code base. I used to work at a company where I had some colleagues who worked on a huge monolithic application which took hours to compile. In this scenario a distributed architecture could be a good idea.
- An application built in a big company with a lot of developers.
- Applications which need to be very scalable.
- Applications where availability is very important.
In all other scenarios a monolithic architecture is probably the best approach.
I hope that by now you can decide whether your next application will be monolithic or distributed, but that doesn't mean you're done yet. There are different types of distributed and monolithic architectures, each with their own advantages and disadvantages. It's important to know about them and make a good decision.
Details of these types are out of the scope of this post, so I will only mention some of them.
Monolithic architecture types
- Layered architecture
- Pipeline architecture
- Microkernel architecture
Distributed architecture types
- Service-oriented architecture (SOA)
- Microservices
- Serverless
I hope you enjoyed reading this, and I'd love to hear any feedback!
Koen Barmentlo | Sciencx (2023-05-14T12:29:48+00:00) Do you really need “microservices”?. Retrieved from https://www.scien.cx/2023/05/14/do-you-really-need-microservices/