In one of Quarkslab's projects, we came across the issue of randomizing a large set of integers, described as a list of disjoint intervals. These intervals can be represented as a sorted list of integers couples, like this one: \([1, 4], [10, 15], [17, 19], \dots\). The idea is to randomly and uniquely select numbers across these intervals, giving a shuffled list of numbers that belong to them. For instance, \([1,10,18,4,3,11,15,17,19,12,14,13,2]\) is a possible output. Moreover, each possible permutation of the integers set should have equal probability of appearance. If you're just interested in the final library that "do the job", go directly to the implementation section to download the leeloo C++ open-source library on Github !

## Trivial algorithm

The not-so-trivial (but still) algorithm is to generate an array containing all the original sorted integers, and then apply a shuffle algorithm (like Fisher–Yates [1]) that uses a common Pseudo Random Number Generator (PRNG). As an example, in C++, std::shuffle can be used to do that.

The main issue is that a buffer of n integers is required. For instance, with \(2^{31}\) 32-bit integers, one needs a buffer of 8GB, which is not acceptable in our situation.

## Other trivial algorithm

Another approach to reduce the memory footprint is to randomly select numbers between \([\![0, n [\![\) (using a classical PRNG), and keep a bitfield of already returned candidates not to return twice the same. When we start to reach too many times the same numbers, we change the algorithm:

- With \(R\) the remaining numbers of candidates to find, get a random number \([\![0, R [\![\) and find the position of the R-th bit not set in the bitfield. That can be optimized thanks to SSE instructions ;
- Set that bit and return the value ;
- Go on with \(R=R-1\) until \(R=0\).

There are multiple drawbacks with this algorithm:

- It still needs \(O(n)\) memory bytes (even if it is less than the previous algorithm) ;
- The final stage can be really slow if R is such that the remaining bitfield does not fit in cache.

See also [5] for a description of a similar algorithm.

## Problem reduction

Thus, the main issue is to generate a list of unique random numbers between a given \([\![0, n [\![\) interval, with good performances (say about 50 million numbers per seconds on a Core i7 3rd gen) and a small memory footprint (\(O(1)\)).

So, the final problem is to be able to choose in an equiprobable manner a permutation of \([\![0, n [\![\) among the \(n!\) ones (\(n!\) being the number of permutations of \(n\) distinct numbers [4]), using only \(O(1)\) bytes of memory (keeping in mind the performance criteria).

The first question is to understand if this is even feasible, and the second issue is to figure out an efficient method to achieve this.

## Formalization

Let's do some math to formalize this problem.

Some context: let \(n < 2^{32}\) and \(\{i \in [\![0,n[\![\}\) the numbers' set to shuffle ; \(n\) is always chosen as a prime number.

The choice of \(n\) as a prime induces interesting properties (as it is shown below), but the careful reader would notice that we won't always have a prime number of integers to generate. However, we can still live with that.

Indeed, let \(n\) the original number of integers to generate and \(p\) chosen as:

- \(p\) is prime ;
- \(p \geq n\) ;
- \(\forall i \in ⟧n,p[\![\), \(i\) is not a prime.

Or, in other words, \(p\) is the smallest prime number greater than or equals to \(n\).

That way, we will produce numbers between \([\![0,p[\![\), and not \([\![0,n[\![\). It is not really an issue because:

- when a number in \([\![n,p[\![\) is generated, just discard it and compute the next one until it belongs to \([\![0,n[\![\) ;
- the density of prime numbers in \([\![0,2^{32}[\![\) allows us to do that, as the maximal gap between two consecutive prime numbers is 354 [2].

We will now work in \(F_p = \mathbb{Z}/p\mathbb{Z}\). \(p\) being a prime, \(F_p\) is a division ring [3] (that's the great property).

Then, with \(S_p\) the set of permutations of \(F_p\), our problem is equivalent to choose with equal probability a permutation in \(S_p\).

## (Partial) resolution

All of that theory is nice, but it does not change a lot of things in concrete. Let's go now in the crux of the issue, and understand what can be done with \(S_p\) :)

### Permutation polynomial

One can notice that every application \(F_p \rightarrow F_p\) can be written as a polynomial of \(F_p[X]\). Indeed, for instance, given \(F\) an application, a chebytchev polynomial equivalent to \(F\) can always be found.

Thus, every element of \(S_p\) can be described as a polynomial of \(F_p[X]\).

### Trivial algorithm

That way, one (still not-so-trivial ;)) algorithm would be:

- Generate a random polynomial of \(F_p[X]\). This is equivalent to compute \(p\) random coefficients ;
- Check if this polynom represents a permutation ;
- If not, go back to the first step.

But wait... There are multiple issues here.

First, these \(p\) coefficients need to be stored in memory, giving a memory footprint of \(O(p)\) bytes.

Moreover, the problem of checking whether a polynomial is a permutation one or not can be somehow complex and slow. Probabilistic methods exist (shown in [6]), but it still leaves us with some potential errors. The performance cost of all of this could be important. We didn't take the time to benchmark this algorithm as it suffers from the \(O(p)\) memory issue...

And finally, left to generate a buffer of \(p\) integers, we could just stick to the first "trivial" algorithm described at the beginning of this paper.

### The real great stuff

We need to find a better way to generate these polynomials. We will use the \(F_p\) division ring properties.

Indeed, it can be demonstrated that, in \(F_p\), every permutation is a bijection, and every bijection is a permutation.

Thus, a whole set of polynomials can be described:

- for every \((a,b) \in (F_p^*,F_p), X \mapsto a*X+b\) is a bijection, and thus belongs to \(S_p\) (Equation 1)
- for every \(c\) such as \(gcd(c,p-1) = 1\), \(X^c\) is also a bijection [6], and belongs to \(S_p\) (Equation 2).

Moreover, as the combination of two bijection functions is a bijection, combining these two sets of polynomials will produce new ones.

What's even more interesting is that it can be demonstrated that, for every \(a \in F_p* \{X+1, a*X, X^{p-2}\}\) is a generator of the \(S_p\) group, using the composition law. [6]

So, the final result is that, theoretically, every permutation of \(S_p\) can be defined as a combination of these polynomials aforementioned.

### Entropy and equiprobability: how random is random?

For the following, we will defines three sets :

- \(G_a = F_p^*\), the values that can take a in (Equation 1) ;
- \(G_b = F_p\), the values that can take b in (Equation 1) ;
- \(G_c = \{c \in F_p \ / \ gcd(c,p-1)=1\}\), the values that can take \(c\) in (Equation 2).

Let's define these two applications:

\begin{align*} L : G_a \times G_b &\rightarrow S_p\\ (a,b) &\mapsto X \mapsto a*X + b \end{align*}

\begin{align*} G : G_c &\rightarrow S_p\\ c &\mapsto X \mapsto X^c \end{align*}

The first idea coming to our mind is to randomly combine the polynomials generated by these applications.

Let's define \(GS_p\) (\(S_p\) stands for 'seed part') as \(G_a \times G_b \times G_c\).

For instance, let's randomly choose \(S0=(a_0,b_0,c_0) \in GS_p\) and \(S1=(a_1,b_1,c_1) \in GS_p\). The couple \((S0,S1) \in GS=GS_p \times GS_p\) can be considered as the seed of our random number generator.

We know that \(L(a_0,b_0) \circ L(a_1,b_1)\) is a permutation polynomial, \(L(a_0,b_0) \circ G(c_0)\) is another one, \(G(c_0) \circ G(c_1)\) also, etc...

(Note: \(\circ\) is the function composition, which means that, for instance, \((L(a,b) \circ G(c))(X) = a*X^c+b\))

Thus, every 'seed' values that belongs to \(GS\) can produce a set of permutations.

Unfortunately, there are main issues with this approach, that we will call "entropy reduction". Indeed, we know that we can create permutations by composing \(L(a_0,b_0), L(a_1,b_1), G(c_0)\) and \(G(c_1)\), but :

- \(L(a_0,b_0) \circ L(a_1,b_1)\) can also be expressed as \(L(a_0*a_1, b_1*a_0+b_0)\). In other words, a combination of affine functions is an affine function). Moreover, as shown in Appendix A, choosing independently \(a_0\) and \(a_1\) in \(G_a\) and computing \(a_0*a_1\) is equivalent to randomly choose a number in \(G_a\). The same goes with \(b_1*a_0+b_0\). Thus, if we choose one seed \((S_0,S_1) \in (G_a \times G_b)^2\) and compute \(L(a_0,b_0) \circ L(a_1,b_1)\), this is equivalent to choose a seed \(S_0 \in (G_a \times G_b)\) ;
- The same issue comes with \(G(c_0) \circ G(c_1)\), which is equals to \(G(c_0*c_1)\) ;
- Even by combining \(L(a_0,b_0) \text{ with } G(c_0)\), then with \(L(a_1,b_1)\) and \(G(c_1)\), (giving \(L(a_1,b_1) \circ G(c_1) \circ L(a_0,b_0) \circ G(c_0)\)), the following question must be answered:

\begin{align*} \text{With } GS' = GS \times GS,\\ UPRNG: GS' &\rightarrow S_p\\ (a_0,b_0,c_0,a_1,b_1,c_1) &\mapsto L(a_1,b_1) \circ G(c_1) \circ L(a_0,b_0) \circ G(c_0) \end{align*}

is there any couple \((S_0,S_1) \in GS'xGS'\) such as \(UPRNG(S_0) = UPRNG(S_1)\) ?

Another way to formalize this problem is as follows: given a seed taking values in a space \(S \text{ of } s\) integers from \(\mathbb{Z}/p\mathbb{Z}\) (\(s\) unknown), is:

\begin{align*} UPRNG : S &\rightarrow S_p\\ seed &\mapsto \text{method to generate a permutation polynomial} \end{align*}

a bijective function?

Now, let's demonstrate a somehow intuitive result.

If \(F\) is a bijection, then \(\|S\| = \|S_p\|\), which gives \(s = p!\). This means that, in order to generate a random permutation of \(S_p\), we must choose a seed number between the \(p!\) ones. In other words, we must choose \(p\) unique random numbers. Well, this has just sent us back to the beginning of this article.

### Our method for compromises

But the game is not yet finished, we haven't gone this far for nothing. So let's work a bit with our results.

We now understand that, somehow, some compromises have to be made. We know that the size of the seed must be reduced. By doing this, we know that we won't be able to uniquely generate all the possible permutations of \(S_p\). Moreover, we want to do this in such a way that these properties will be conserved the best way:

- we still reach a fairly "reasonable" amount of permutations among \(S_p\) ;
- all these permutations are unique (or a "lot of" them) ;
- all of this has still "good" performances (we haven't talk yet a lot about this one, but we don't forget it :)).

At this point, we decided to study the following UPRNG (named \(UPRNGcomp\)):

\begin{align*} \text{With } GS = G_a \times G_b \times G_c \times N^*,\\ UPRNGcomp : GS &\rightarrow S_p\\ (a,b,c,n) &\mapsto (G(c) \circ L(a,b))^n \end{align*}

This choice is made because it produces a function that can be easily computed, and can still give interesting results.

### Number of generated permutations

If \(n\) is randomly chosen in \([\![1,N[\![\), then the number of generated permutations with this method, is : \(p*(p-1)*Phi(p-1)*N\) (with \(Phi\) the Euler totient function [3]).

As we've seen above, the number of unique generated permutations may be inferior to this.

Thus, if we have for instance \(n=2\), we can search for the set of \(seeds \in GS\) for which same UPRNG is the same.

Let

- \((S_0,S_1) \in G_a \times G_b \times Gc\) ;
- \(S_0 = (a_0,b_0,c_0)\) ;
- \(S_1 = (a_1,b_1,c_1)\).

We need to resolve:

The complete resolution of this equation being a bit human-time consuming, we'll do it with \(c_0=c_1=3\), and using mathematical software, we can find these solutions :

- obviously, \(\{a_0=a_1, b_0=b_1\}\) ;
- and \(\{a_0=p-a_1, b_0=b_1=0\}\).

Which means than, when \(b_0=b_1=0\), only half the numbers of possible values for \(a\) will give a unique permutation. By the way, this proves the fact that our UPRNG function isn't bijective.

We can test this easily with \(p=17\). \(gcd(3,17)\) being equals to 1, we can define:

\begin{align*} UPRNG: G_a \times G_b \times Gc &\rightarrow S_p\\ (a,b) &\mapsto G(3) \circ L(a,b) \circ G(3) \circ L(a,b) \end{align*}

And this python code:

```
def l(x,a,b,p): return (a*x+b)%p
def g(x,c,p): return (x**c)%p
def lgn(x,a,b,c,p,n):
for i in xrange(0,n):
x = g(l(x,a,b,p),c,p)
return x
list_x0 = list()
list_x1 = list()
p = 17
a = 5
b = 0
c = 3
n = 2
for x in range(0,p):
list_x0.append(lgn(x, a, b, c, p, 2))
list_x1.append(lgn(x, p-a, b, c, p, 2))
print(list_x0)
print(list_x1)
```

Which gives:

```
[0, 4, 8, 5, 16, 14, 10, 6, 15, 2, 11, 7, 3, 1, 12, 9, 13]
[0, 4, 8, 5, 16, 14, 10, 6, 15, 2, 11, 7, 3, 1, 12, 9, 13]
```

One possible solution is to reduce the space of \(G_a\), for instance with \(G_a=[1,\frac{p-1}{2}]\). But, without the full resolution of the (equation 1), this is just a partial resolution.

## Going further with another UPRNG

If we want to reduce the "entropy reduction", we need to improve the size of the seed values. For instance, this UPRNG could be defined as:

\begin{align*} UPRNG2 : GS = G_a \times G_b \times G_a \times G_b \times G_c \times N^* &\rightarrow S_p\\ (a_0,b_0,a_1,b_1,c,n) &\mapsto (L(a_1,b_1) \circ G(c_1) \circ L(a_0,b_0))^n \end{align*}

As above, we try to find \((S_0,S_1) \in GS\) such as \(UPRNG2(S_0) = UPRNG2(S_1)\).

Using mathematical resolution software, with \(n=1 \text{ and } c=9\) (for instance), this gives the following solutions:

Let

- \(S_0=(a_0,b_0,a_1,b_1)\) ;
- \(S_1=(a_2,b_2,a_3,b_3)\).

We have:

- \(S_0=S_1\) (trivial) ;
- \(\{a_1 = a_3*a_2^9*(a_0^{-1})^9, b_0 = b_2*a_0*(a_2^{-1}), b_1 = b_3\}\).

which gives constraints on the choice of our constants in order to try and have \(UPRNG2\) bijective.

The resolution of the full system is left for further work on the subject ;)

## Implementation and benchmarks

The implementaion of what's described here (and more) has been done in the C++ "leeloo" open-source library that you can find on github here : https://github.com/quarkslab/libleeloo. It also provides python bindings for python fans around here.

The library allows to manage integer intervals, aggregate them and randomly sort the elements as described in the introduction. It also provides an IPv4 range parser for convenience usage.

There are two main UPRNG implemented:

- one that uses the method described here [5]. This one is historical, optimised with SSE/AVX instructions and "fast" (see figures below) ;
- one that uses \(URPNGcomp\). It is 8 to 14 times slower that the original one (due to the modular exponentation), but provides a larger possible set of permutations.

Moreover, each UPRNG can be instantiated in "atomic" mode, which makes them thread-safe.

Some figures about performances : on a Core i7-3770 (3.4GHz, 4 cores with Hyperthreading), we obtain:

- with the first UPRNG, we can generate, with the SSE/AVX and parallelised version, about 290 millions of 32-bit numbers per second. This makes a memory bandwidth of about 1.2GB/s, making this generator CPU-bound (for now) ;
- with the second UPRNG, we can generate, with the parallelised version, about 30/n million numbers/s ('n' being the part of the seed that defines the number of compositon of \(G \circ L\).). This is because the performance of this generator is limited mainly by the modular exponentation computations. This generator is also clearly CPU-bound.

C++ and Python usage samples can be found on github at : https://github.com/quarkslab/libleeloo/tree/master/tools and https://github.com/quarkslab/libleeloo/tree/master/tests.

## Conclusion

Giving some compromises, we find a solution to our original problem that is actually good enough for our project needs. We still are a bit frustrated not to have the actual time to go further in this subject.

There exists other ways that haven't been studied here to generate permutation polynomial, as for instance described on this wikipedia page : http://en.wikipedia.org/wiki/Permutation_polynomial. This is also another interesting work that could be done :)

It can also be mentioned that some people already looked into the subject and published articles. For instance, [5] uses the quadratic residues with prime numbers. It can be noticed that the permutation given by this method can also be expressed as a permutation polynomial (but involves more computations).

Finally, thanks to Sebastien Kaczmarek (@deesse_k) for the original talks on the subject (and other ideas), to Ninon Eyrolles for her help on the redaction and some of the mathematics here and Kévin Szkudlapski for his advices.

## Going further

For the reader that might want to go further, here are some ideas:

- Work on UPRNG2 ;
- For the described UPRNGs, find out the number of unique permutations ;
- Benchmark and analyze other ways to generate permutation polynmials ;
- Something that would be nice: given a seed space \(S\) of size \(s\), find out \(S\) and a subset of \(S_p\) such as:

\begin{align*} F : S &\rightarrow \text{subset of }S_p\\ seed &\mapsto F(seed) \end{align*}

is a bijection.

You're welcome to send us feedbacks :).

### Appendix A

Let:

- \(p\) a prime number ;
- \(F_p = \mathbb{Z}/p\mathbb{Z}\) (which is a division ring) ;
- \(X \text{ and } Y\) two independant random variables of \(F_p\).

First, we have:

\begin{align*} P(X=x) = \frac{1}{p},\\ P(Y=y) = \frac{1}{p} \end{align*}

We have, for every \(n \in F_p\),

\begin{align*} P(X+Y=n) &= \sum_{x+y=n}(P(X=x)*P(Y=y)) \ \ \text{(X and Y are independant)}\\ &= \sum_{x+y=n}\frac{1}{p^2} \end{align*}

Let \(x \in F_p\) and \(y \in F_p\), such as \(x+y=n\), Thus, \(x\) can have \(p\) different values, and \(y=n-x\) is unique for a fixed \(x\). The number of \((x,y) \in F_p^2\) such as \(x+y=n\) is then \(p\).

So,

which means that choosing independently two random numbers in \(F_p\) and sum the two of them is equivalent to only choose one random number in \(F_p\).

Let's demonstrate the same result with \(X*Y\).

\begin{align*} P(X*Y=n) &= \sum_{x*y=n}(P(X=x)*P(Y=y)) \ \ \text{(X and Y are independant)}\\ &= \sum_{x*y=n}\frac{1}{p^2} \end{align*}

Let \(x \in F_p\) and \(y \in F_p\) such as \(x*y=n\),

Thus, \(x\) can have \(p\) different values. As \(F_p\) is a division ring, we have \(y=n*x^{-1}\) which is unique for a given \(x\). So, the number of \((x,y) \in F_p^2\) such as \(x*y=n \text{ is }p\), and

If we take these two results, and let \(X, Y \text{ and } Z \in F_p\),

\begin{align*} P(X*Y+Z=n) &= \sum_{x*y+z}(P(X=x)*P(Y=y)*P(Z=z))\\ &= \sum_{x*y+z=n} \frac{1}{p^3} \end{align*}

Let \(y\) and \(z \in F_p\), then \(x=(n-z)*y^{-1}\) exists and is unique for a given \(y\) and \(z\). Thus, the number of \((x,y,z) \in F_p^3\) such as \(x*y+z=n\) is \(p^2\), and:

## References

[1] | https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle |

[2] | https://en.wikipedia.org/wiki/Prime_gap#Numerical_results |

[3] | (1, 2) https://en.wikipedia.org/wiki/Euler%27s_totient_function |

[4] | https://en.wikipedia.org/wiki/Permutation |

[5] | (1, 2, 3) http://preshing.com/20121224/how-to-generate-a-sequence-of-unique-random-integers |

[6] | (1, 2, 3) http://tel.archives-ouvertes.fr/docs/00/43/87/65/PDF/LaigleChapuy_these.pdf |

## Comments