Willy,
On 2020-08-05 11:30 p.m., Willy Tarreau wrote:
On Wed, Aug 05, 2020 at 03:21:11PM -0700, Marc Plumb wrote:
There is nothing wrong with perturbing net_rand_state, the sin is doing it with the raw entropy that is also the seed for your CPRNG. Use the output of a CPRNG to perturb the pool all you want, but don't do things that bit by bit reveal the entropy that is being fed into the CPRNG.
This is interesting because I think some of us considered it exactly the other way around, i.e. we're not copying exact bits but just taking a pseudo-random part of such bits at one point in time, to serve as an increment among other ones. And given that these bits were collected over time from not very secret sources, they appeared to be of lower risk than the output.
No. The output of a CPRNG can't be used to determine the internal state. The input can. The input entropy is the one thing that cannot be produced by a deterministic computer, so it is the crown jewels here. It's much, much safer to use the output.
I mean, if we reimplemented something in parallel just mixing the IRQ return pointer and TSC, some people could possibly say "is this really strong enough?" but it wouldn't seem very shocking in terms of disclosure. But if by doing so we accidentally ended up reproducing the same contents as the fast_pool, it could be problematic.
Would you think that using only the input info used to update the fast_pool would be cleaner? I mean, instead of:
        fast_pool->pool[0] ^= cycles ^ j_high ^ irq;
        fast_pool->pool[1] ^= now ^ c_high;
        ip = regs ? instruction_pointer(regs) : _RET_IP_;
        fast_pool->pool[2] ^= ip;
        fast_pool->pool[3] ^= (sizeof(ip) > 4) ? ip >> 32 : get_reg(fast_pool, regs);
we'd do:
        x0 = cycles ^ j_high ^ irq;
        x1 = now ^ c_high;
        x2 = regs ? instruction_pointer(regs) : _RET_IP_;
        x3 = (sizeof(ip) > 4) ? ip >> 32 : get_reg(fast_pool, regs);

        fast_pool->pool[0] ^= x0;
        fast_pool->pool[1] ^= x1;
        fast_pool->pool[2] ^= x2;
        fast_pool->pool[3] ^= x3;

        this_cpu_add(net_rand_state.s1, x0^x1^x2^x3);
No. That's just as bad. There are two major problems:
1) It takes the entropy and sends it to the outside world without any strong crypto between the seed and the output. Reversing this isn't trivial, but it also isn't provably difficult.
2) It adds small amounts of entropy at a time and exposes them to the outside world. No crypto can make this safe (google "catastrophic reseeding"). If an attacker can guess the time within 1ms, then on a 4GHz CPU that's about 4 million possible cycle counts, i.e. only 22 bits of uncertainty, so it's possible to brute force the input. The details of which parts of the fast_pool are used are irrelevant, since they are determined by that same input, so they add no security against this type of brute force attack. The only other part is the initial TSC offset, but if that were sufficient we wouldn't need the reseeding at all.
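To make the 22-bit figure and the brute force concrete, here is a rough user-space sketch. Everything in it is hypothetical and for illustration only: toy_output() is a stand-in for a non-cryptographic output step (it is not the kernel's prandom code), and the attacker is assumed to already know the state from before the perturbation and to see one output afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a non-cryptographic output function.
 * This is one xorshift32 step; it is NOT the kernel's generator. */
static uint32_t toy_output(uint32_t state)
{
	state ^= state << 13;
	state ^= state >> 17;
	state ^= state << 5;
	return state;
}

/* Smallest number of bits needed to index every value in a window of
 * cycles_in_window possible cycle counts. */
static unsigned int uncertainty_bits(uint64_t cycles_in_window)
{
	unsigned int bits = 0;

	while (bits < 63 && (1ULL << bits) < cycles_in_window)
		bits++;
	return bits;
}

/* Assumption: the attacker knows the state before the perturbation and
 * observes one output after "state += input". Try every input in the
 * window and count the candidates that reproduce the observed output;
 * the last matching candidate is stored in *found. */
static uint32_t brute_force_input(uint32_t known_state, uint32_t observed,
				  uint32_t window, uint32_t *found)
{
	uint32_t matches = 0;

	assert(found != NULL);
	for (uint32_t guess = 0; guess < window; guess++) {
		if (toy_output(known_state + guess) == observed) {
			*found = guess;
			matches++;
		}
	}
	return matches;
}
```

With a 1ms window at 4GHz, uncertainty_bits(4000000) gives 22, and brute_force_input() over those 4 million candidates finishes almost instantly on any modern machine. A real attack has more unknowns than this toy, but each small reseed only adds a search of about this size.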
I didn't know about SFC32, it looks like a variation of the large family of xorshift generators, which is thus probably quite suitable as well for this task. Having used xoroshiro128** myself in another project, I found it overkill for this task compared to MSWS but I definitely agree that any of them is more suited to the task than the current one.
It's actually a chaotic generator (not a linear one like an xorshift generator). That gives it weaker period guarantees, but also makes it more difficult to reverse. A counter is added to help the period length.
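For anyone who hasn't seen it, here is a minimal sketch of the SFC32 step as I recall it from PractRand. The struct and function names are mine, the shift constants (21, 9, 3) are the ones I remember from that implementation, and seeding is deliberately left out, so treat this as illustrative rather than as a reference implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; 128 bits of state, 32 of which are a plain counter. */
struct sfc32_state {
	uint32_t a;
	uint32_t b;
	uint32_t c;
	uint32_t counter;
};

static uint32_t rotl32(uint32_t x, unsigned int k)
{
	return (x << k) | (x >> (32 - k));
}

/* One step. The additions mixed with xor, shift and rotate make the
 * update non-linear over GF(2), unlike an xorshift step. The counter is
 * folded into the output and then incremented, which is what gives a
 * guaranteed lower bound on the period. */
static uint32_t sfc32_next(struct sfc32_state *s)
{
	uint32_t tmp;

	assert(s != NULL);
	tmp = s->a + s->b + s->counter++;
	s->a = s->b ^ (s->b >> 9);
	s->b = s->c + (s->c << 3);
	s->c = rotl32(s->c, 21) + tmp;
	return tmp;
}
```

A caller would fill in a, b and c from the seed, set the counter to a fixed start value, and discard a few initial outputs so the seed gets mixed in; the exact number to discard is a seeding detail I have not tried to reproduce here.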
I'll trust Amit that SFC32 isn't strong enough and look at other options -- I just thought of it as better and faster than the existing one with the same state size. Maybe a larger state is needed.
Thanks,
Marc