Independent Functions or How to Create the Worst Random Number Generator

Andrey Komarov
Article by Andrey Komarov
Friday, May 31st, 2019
Let’s talk about random numbers. As we know, among other things, they are used in cryptography. This post is written by our friends from the ‘Mind vs Trash’ VK-community, and it will show you how easy it is to spoil things when it comes to randomness.

Nowadays, most of the cryptographic operations use computers. Since a computer is a deterministic device, it isn’t able to simply generate a truly random number. In this case, one may use various sources of entropy form the outer world, such as thermometer or voltmeter readings. Those numbers can be considered fairly random.

The solution mentioned above still has a problem: it’s impossible to get a lot of samples very fast, while sometimes that’s exactly what is needed. In order to solve this problem, pseudo-random number generators were invented. They generate several genuinely random numbers which become seeds for generating as many pseudo-random numbers as needed.

However, this solution has its weaknesses as well. If a generator turns out to be predictable, malicious actors will be able to break the encryption. We need to prevent that from happening! Thinking the same way, Intel engineers added the RDRAND to the x86 instruction set. According to Intel, the instruction allows quickly generating random data based on some hardware signals. Tempting, right?

Unsurprisingly, the Linux kernel developers took it with a grain of salt but, nevertheless, considered the approach. In a similar way, Linux allows getting random numbers by reading the file /dev/urandom. Although one may suggest using /dev/random for true randomness, there exist some concerns even about the safety of /dev/random.

So, back to /dev/urandom. How does the data get in it? One can use the following trick. Let we have several independent sequences of random numbers. Then, if any of them is ‘good,’ a sequence obtained as a result of XOR of all of them will be ‘good’ as well. Perfect. Intel gave us a new opportunity to generate random numbers, so let’s take a sequence of pseudorandom numbers and XOR it with an output of RDRAND and use the resulting sequence as an output of /dev/urandom. If RDRAND is a good generator, then the /dev/urandom also becomes good. If RDRAND is not good, nothing gets worse. In any case, we win, thank you Intel!

But all of that is not true. The old version of /dev/random is a random number generator, but there’s one problem. The trick described earlier requires that the data comes from independent sources. One may argue: they must be independent because RDRAND isn’t used in generating a pseudorandom sequence for /dev/urandom and vice versa. But what if Intel lies us about the randomness of the data that RDRAND outputs, and a processor has been designed in a particular way to harm users? For example, a processor before executing RDRAND can check that the executing code behaves like a process of generating pseudorandom numbers for /dev/urandom, read what has already been generated and return the value that after the use of XOR to what we have will output the needed number. This way they can control the data /dev/urandom generates. If such output is used for generating cryptographic keys, they will be able to access those keys.

‘But come on, nobody would do it,’ you may say. ‘We are talking about a processor, a piece of hardware. After it’s been produced, nothing can be changed in it. They won’t try so hard to configure it.’ Well, I have bad news for you: what I’ve described earlier had already been implemented in Bochs, an emulator for several stock CPUs. I have good news as well: an emulator is not a processor; this modification wasn’t implemented in a processor. Oh, some more bad news: it actually can be modified in a processor even after a factory production stage. For more than 20 years processors have supported microcode updates – and one can download an update that will change some of the instructions.

Final good news: RDRAND is not used in Linux anymore, so we’re safe. Well, actually, disregard the last statement about safety – but that should be a topic for another article.

Do you like the subject? Read other articles from the series – about the problem of intermediate recursively enumerable Turing Degrees and about the challenge of modeling large quantum computers.