The CPU heartbeat

These days, processors run at clock frequencies of 2 to 4 GHz. This means that a processor gets 2 to 4 times 10^9 clock signals every second to do something. A processor cannot perform any atomic operation faster than this, and there is no reason to create a clock that is faster than what a processor can follow. It means that a CPU performs a simple operation, such as incrementing a register, in half or a quarter of a nanosecond. This is the heartbeat of the processor, and if we think of the bureaucrats as humans, which they are, then it is approximately equivalent to one second, the time of one of their heartbeats.

Processors have registers and on-chip caches at different levels: L1, L2, and sometimes L3. Beyond these, there are main memory, SSDs, disks, the network, and tapes, any of which may be needed to retrieve data.

Accessing data that is in the L1 cache takes approximately 0.5ns. This is grabbing a paper that is on your desk, which takes half a second. Accessing the L2 cache takes 7ns. This is a paper in the drawer. You have to push the chair back a bit, bend forward while seated, pull out the drawer, take the paper, push the drawer back, straighten up, and put the paper on the desk; it takes 10 seconds, give or take.

A main memory read takes 100ns. The bureaucrat stands up, goes to the shared filing cabinet by the wall, waits while other bureaucrats are pulling out their papers or putting theirs back, selects the drawer, pulls it out, takes the paper, and walks back to the desk. This takes two minutes. This is what volatile variable access is like: every time you write a single word on a document, the trip has to be made twice, once to read and once to write, even if you happen to know that the next thing you will do is just fill in another field of the form on the same paper.

Modern architectures, where there are no longer multiple CPUs but rather single CPUs with multiple cores, are a bit faster. One core may check the other cores' caches to see whether there was any modification of the same variable. This speeds volatile access up to 20ns or so, which is still an order of magnitude slower than non-volatile access.
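To make the difference between the two kinds of access concrete, here is a minimal Java sketch that increments a plain field and a volatile field the same number of times and prints a crude per-increment time. The class name VolatileCost and its methods are hypothetical, invented for this illustration. The timing is naive wall-clock measurement, so the printed numbers will vary by machine and JVM. Because only one thread runs here and nothing contends for the cache line, the measured gap should be expected to be much smaller than the cross-core figures quoted above.

```java
import java.util.function.LongSupplier;

// Illustrative sketch only: the class and method names are hypothetical.
final class VolatileCost {
    private long plain;            // the JIT may keep this in a register
    private volatile long shared;  // every read and write must be visible to other threads

    // Increments the plain field n times and returns its final value.
    long bumpPlain(long n) {
        for (long i = 0; i < n; i++) {
            plain++;
        }
        return plain;
    }

    // Increments the volatile field n times. Each step is one volatile
    // read followed by one volatile write.
    long bumpShared(long n) {
        for (long i = 0; i < n; i++) {
            shared++; // not atomic, but safe here because only one thread runs
        }
        return shared;
    }

    // Crude wall-clock timing. A serious measurement would use a
    // benchmarking harness with warm-up instead.
    static long nanosFor(LongSupplier work) {
        long start = System.nanoTime();
        work.getAsLong();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long n = 50_000_000L;
        VolatileCost c = new VolatileCost();
        long plainNanos = nanosFor(() -> c.bumpPlain(n));
        long sharedNanos = nanosFor(() -> c.bumpShared(n));
        System.out.printf("plain:    %.2f ns per increment%n", (double) plainNanos / n);
        System.out.printf("volatile: %.2f ns per increment%n", (double) sharedNanos / n);
    }
}
```

Both loops compute the same result; the only difference is the volatile modifier, which obliges the JVM to read and write the field for real on every step, the two trips to the filing cabinet in the bureaucrat analogy.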

Although the rest is less relevant to multithreaded programming, it is worth mentioning here because it gives a good understanding of the different orders of magnitude of time.

Reading a block from an SSD (usually a 4K block) takes 150,000ns. In human speed, that is a little bit more than 5 days. Reading from or sending something to a server over the network on gigabit local Ethernet takes 0.5ms, which is like waiting for almost a month for the metaphoric bureaucrat. If the data over the network is on a spinning magnetic disk, then the seek time (the time until the head is positioned and the disk rotates so that the right part of the magnetic surface gets under the reading head) brings this up to 20ms. That is, approximately, a year in human terms.

If we send a network packet over the Atlantic on the Internet, it takes approximately 150ms. That is like 14 years, and this was only one single packet; if we want to send data over the ocean, it may take seconds, which add up to historic time spans: thousands of years. If we count one minute for a machine to boot, it is equivalent to the time span of our whole civilization.
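The conversions above can be sketched in a few lines of Java. This is an illustration under one stated assumption: a 3 GHz clock (the middle of the 2 to 4 GHz range), with one clock cycle counted as one human second. The class name HumanScale and its methods are hypothetical. The latencies in main are the ones quoted in the text.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only: the class and method names are hypothetical.
final class HumanScale {
    // Assumption: a 3 GHz clock, that is, three cycles per nanosecond.
    static final double ASSUMED_GHZ = 3.0;

    // One clock cycle counts as one human second, so the human time in
    // seconds is simply the number of cycles that fit into the latency.
    static double toHumanSeconds(double nanos, double ghz) {
        return nanos * ghz;
    }

    // Formats a duration in seconds using the largest fitting unit.
    static String describe(double seconds) {
        double year = 365.0 * 86_400;
        if (seconds < 60) return String.format(Locale.ROOT, "%.1f seconds", seconds);
        if (seconds < 3_600) return String.format(Locale.ROOT, "%.1f minutes", seconds / 60);
        if (seconds < 86_400) return String.format(Locale.ROOT, "%.1f hours", seconds / 3_600);
        if (seconds < year) return String.format(Locale.ROOT, "%.1f days", seconds / 86_400);
        return String.format(Locale.ROOT, "%.1f years", seconds / year);
    }

    public static void main(String[] args) {
        // Latencies in nanoseconds, as quoted in the text.
        Map<String, Double> latencies = new LinkedHashMap<>();
        latencies.put("L1 cache", 0.5);
        latencies.put("L2 cache", 7.0);
        latencies.put("main memory", 100.0);
        latencies.put("SSD block read", 150_000.0);
        latencies.put("local network", 500_000.0);
        latencies.put("remote spinning disk", 20_000_000.0);
        latencies.put("transatlantic packet", 150_000_000.0);
        latencies.put("one-minute boot", 60e9);

        for (Map.Entry<String, Double> e : latencies.entrySet()) {
            double human = toHumanSeconds(e.getValue(), ASSUMED_GHZ);
            System.out.printf("%-22s %s%n", e.getKey(), describe(human));
        }
    }
}
```

Under this assumption the SSD read comes out at about 5.2 days and the transatlantic packet at about 14.3 years, in line with the figures above. The cache and memory entries come out somewhat larger than the rounded values in the prose, so the exact numbers matter less than the orders of magnitude.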

We should consider these numbers when we want to understand what the CPU is doing most of the time: it waits. It also helps to calm your nerves when you think about the speed of a real-life bureaucrat. They are not that slow after all, if we consider their heartbeat, which implies the assumption that they have a heart. However, let's go back to real life, CPUs, L1 and L2 caches, and volatile variables.
