This content originally appeared on Bits and Pieces - Medium and was authored by Tam Nguyen
In the TV show, ‘Silicon Valley’, the character Gavin Belson said “small is the new big” — he wasn’t wrong.
With the endless stream of information we encounter daily, it getting harder to pay attention to anything. URL shorteners such as bit.ly are widely popular because they help to shorten arbitrary URLs like “https://www.google.com/search?q=fasdfsa&sxsrfAOaemvLDZGX" into something a lot shorter like “https://www.bit.ly/dut122"
Requirements
To design a system correctly, we need to ask the right questions, such as:
- What is the traffic volume?
- What characters are allowed in the shortened URL?
- How long should the shortened URLs should be?
To design a scalable system, you should design a system which can handle up to millions of URLs generated per day. The length of the generated URLs should be as short as possible. They can be a combination of characters (a-z, A-Z) as well as numbers (0–9).
Use case
- User input a long URL, the service returns a much shorter URL
- User clicks the shortened URL, they are redirected to the original URL
- The service should be highly available, scalable, fault tolerant.
Quick estimations
- Write: 100 million URLs shortened per day.
- Write Per Second: 100 million / 24/ 3600 = 1160.
- Read: Because it is hard to estimate how many times a users will need to use the URL, we can assume the ratio to be 10:1, read to write. Therefore, it will be 1160 * 10 = 11,600.
- When you build something, it should be durable. So we can expect the service will up and running for 15 years. This means we have to support 100 million * 365 * 15 = 548 billions records.
- An average URL length will be 100. Then the storage requires over 15 years will be 548 billions * 100 bytes = 54.8 TB.
High-level design
The high-level design of this system will focus on the API endpoints.
A URL shortener will require two primary API endpoints. The first one will be a POST request to create new short URL. The second one will be a GET request for URL redirecting.
- POST api/v1/url/shorten — @request parameter: longURL, @return shortURL
- GET api/v1/shorturl — @return longURL for HTTP connection
To redirect a URL, once a user send the above GET request, the service will change the short URL to the long URL with 301 redirect.
Low-level design
- User inputs a longURL.
- The service will check if the longURL already existed in the database.
- If it does, fetch the shortened URL from the database and return it.
- If it doesn’t, create a new shortURL.
- Store all the attributes in the database with ID, shortURL, and longURL.
Data model
Although there are many ways in which you can store the data for this service, I believe that the model in the image below is the best option:
It is a simple model, but is enough for the purpose of this post.
How to generate the unique shortened link
While a hash function could be the answer here, in this post I will use a “Base-62 Conversion”. Although each mechanism has its own pros and cons, using a hash function can result in collisions between shortened URLs. While this can be avoided, it can be expensive to calculate.
Our aim is to design a scalable system, so we want to use the most sustainable solution for our system. We can’t risk making the service doing extra work while we have the luxury of designing it differently. Furthermore, base conversion is already a popular algorithm used for URL shorteners.
For those that aren’t familiar with base conversion, it is an algorithm that helps convert the same number between its different number representation systems. In this case, the ‘base-62 conversion’ is used as if there are 62 possible characters for the shortened URL value.
For example, the number of 649 in the long URL will be converted to “AT” in the base-62. Therefore, our shortened URL will look like this “bit.ly/AT”. Hope things are clearly for everyone.
So the workflow on the service now will be like this:
- A user enters a long URL.
- The service will check if the long URL already exists in the database or not. If it does, the shortened URL will be returned.
- If not, the database will generate a random ID for the URL *.
- We converted the newly generated ID with the base-62 to create a new unique string. Then, we concat that tiny unique string with our domain to become a short URL.
- We store everything in the database according to the model in the image above.
*This will not be a problem with you only using one server for the service, which is unlikely. In case the service uses a distributed system, we have to create a ID generator for the system because it can be the case where 2 exactly ID being generated. But in the scope of this post, I will not talk about that. It is for another post.
There is also one small notice here as we design this system. Because there are more reads than writes, to boost the performance, we will store the mapping of <shortURL, longURL> in a cache.
So let’s adjust our workflow slightly:
- The user will click the shortened URL link.
- The load balancer will send the request to the web servers.
- If the shortened URL is already in the cache, return the longURL immediately.
- Bad cases will happen if the shortened URL is not in the cache, the service will have to go find it in the database. Worst case scenarios is the user enters an invalid URL.
- The longURL will be returned to the user.
Conclusion
Well, this post has been long enough and I think all the necessities of designing a URL shortener have been addressed. If you have any questions, do not hesitate to contact me at hi@tamnguyen.io.
Thank you everyone and stay safe.
Subscribe to my website at tamnguyen.io for more posts on system design.
Build Micro Frontends with components
Microfrontends are a great way to speed up and scale app development, with independent deployments, decoupled codebases, and autonomous teams.
Bit offers a great developer experience for building component-driven Micro frontends. Build components, collaborate, and compose applications that scale. Our GitHub has over 14.5k stars!
Learn more
- Building a React Component Library — The Right Way
- Microservices are Dead — Long Live Miniservices
- 7 Tools for Faster Frontend Development in 2022
How to Design a Scalable URL Shortener System was originally published in Bits and Pieces on Medium, where people are continuing the conversation by highlighting and responding to this story.
This content originally appeared on Bits and Pieces - Medium and was authored by Tam Nguyen
Tam Nguyen | Sciencx (2021-12-29T11:10:45+00:00) How to Design a Scalable URL Shortener System. Retrieved from https://www.scien.cx/2021/12/29/how-to-design-a-scalable-url-shortener-system/
Please log in to upload a file.
There are no updates yet.
Click the Upload button above to add an update.