Capacity estimation

Our system will be read-heavy: there will be many more redirection requests than new URL shortenings. Let's assume a 100:1 ratio between reads and writes.

  • Traffic estimates: Assuming we have 500 million new URL shortenings per month, with a 100:1 read/write ratio we can expect 50 billion redirections during the same period (100 * 500 million = 50 billion). What would be the Queries Per Second (QPS) for our system?

New URL shortenings per second:

500 million / (30 days * 24 hours * 3600 seconds) ~= 200 URLs/s

URL redirections per second:

50 billion / (30 days * 24 hours * 3600 seconds) ~= 19K/s
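These back-of-envelope numbers can be sanity-checked with a few lines of Python, using the assumptions stated above (500M new URLs per month and the 100:1 read-to-write ratio):

```python
# QPS estimate, using the assumed figures above:
# 500M new URLs/month and a 100:1 read-to-write ratio.
SECONDS_PER_MONTH = 30 * 24 * 3600                 # 2,592,000 seconds

new_urls_per_month = 500_000_000                   # writes
redirections_per_month = 100 * new_urls_per_month  # reads

write_qps = new_urls_per_month / SECONDS_PER_MONTH
read_qps = redirections_per_month / SECONDS_PER_MONTH

print(f"New URL shortenings: ~{write_qps:.0f}/s")        # ~193/s, rounded to 200/s
print(f"URL redirections:   ~{read_qps / 1000:.0f}K/s")  # ~19K/s
```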

  • Storage estimates: Since we expect 500M new URLs every month, and we plan to keep each object for five years, we will store about 30 billion objects in total:
500 million * 5 years * 12 months = 30 billion

Let's assume that each object we are storing can be of 500 bytes (just a ballpark figure, we will dig into it later); we would need 15 TB of total storage:

30 billion * 500 bytes = 15 TB
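The storage math follows directly from the assumptions above (500M URLs/month, five-year retention, ~500 bytes per object); a quick sketch:

```python
# Storage estimate, using the assumptions above: 500M new URLs/month,
# a five-year retention period, and ~500 bytes per stored object.
new_urls_per_month = 500_000_000
retention_months = 5 * 12
bytes_per_object = 500

total_objects = new_urls_per_month * retention_months     # 30 billion
total_storage_tb = total_objects * bytes_per_object / 10**12

print(f"Objects stored: {total_objects:,}")          # 30,000,000,000
print(f"Total storage:  {total_storage_tb:.0f} TB")  # 15 TB
```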

  • Bandwidth estimates: For write requests, since every second we expect 200 new URLs, the total incoming data for our service would be 100 KB per second:
200 * 500 bytes = 100 KB/s

For read requests, since every second we expect ~19K URL redirections, the total outgoing data for our service would be about 9 MB per second:

19K * 500 bytes ~= 9 MB/s
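The bandwidth figures are just the per-second request rates multiplied by the assumed object size; a minimal check:

```python
# Bandwidth estimate under the same assumptions:
# ~200 writes/s, ~19K reads/s, ~500 bytes per object.
write_qps = 200
read_qps = 19_000
bytes_per_object = 500

incoming_kb_per_s = write_qps * bytes_per_object / 1000
outgoing_mb_per_s = read_qps * bytes_per_object / 10**6

print(f"Incoming: {incoming_kb_per_s:.0f} KB/s")   # 100 KB/s
print(f"Outgoing: ~{outgoing_mb_per_s:.1f} MB/s")  # 9.5 MB/s, rounded to ~9 MB/s above
```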

  • Memory estimates: If we want to cache some of the hot URLs that are frequently accessed, how much memory will we need to store them? If we follow the 80-20 rule, meaning that 20% of URLs generate 80% of the traffic, we would like to cache that hot 20% of URLs.

Since we have 19K requests per second, we would be getting 1.7 billion requests per day:

19K * 3600 seconds * 24 hours ~= 1.7 billion

To cache 20% of these requests, we would need 170 GB of memory:

0.2 * 1.7 billion * 500 bytes ~= 170 GB
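The cache sizing above can be sketched the same way, again using the assumed figures (19K reads/s, the 80-20 rule, ~500 bytes per object):

```python
# Cache-size estimate: 19K reads/s, the 80-20 rule (cache the hot 20%),
# and ~500 bytes per object, as assumed above.
read_qps = 19_000
requests_per_day = read_qps * 3600 * 24   # ~1.64 billion (~1.7 billion above)
hot_fraction = 0.2
bytes_per_object = 500

cache_gb = hot_fraction * requests_per_day * bytes_per_object / 10**9
print(f"Cache memory: ~{cache_gb:.0f} GB")  # ~164 GB, rounded to 170 GB above
```

Note that many of these daily requests will be for the same URLs, so the actual memory needed will be less than this upper bound.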