解密随机数生成器(二)——从java源码看线性同余算法--688IT编程网

解密随机数⽣成器（⼆）——从java源码看线性同余算法Random

Java中的Random类⽣成的是伪随机数，使⽤的是48-bit的种⼦，然后调⽤⼀个linear congruential formula线性同余⽅程（Donald Knuth的编程艺术的3.2.1节）

如果两个Random实例使⽤相同的种⼦，并且调⽤同样的函数，那么⽣成的sequence是相同的

也可以调⽤Math.random()⽣成随机数

Random实例是线程安全的，但是并发使⽤Random实例会影响效率，可以考虑使⽤urrent.ThreadLocalRandom(jdk1.7)。

/**

* A random number generator isolated to the current thread. Like the

* global {@link java.util.Random} generator used by the {@link

* java.lang.Math} class, a {@code ThreadLocalRandom} is initialized

* with an internally generated seed that may not otherwise be

* modified. When applicable, use of {@code ThreadLocalRandom} rather

* than shared {@code Random} objects in concurrent programs will

* typically encounter much less overhead and contention. Use of

* {@code ThreadLocalRandom} is particularly appropriate when multiple

* tasks (for example, each a {@link ForkJoinTask}) use random numbers

* in parallel in thread pools.

* Usages of this class should typically be of the form:

* {@code ThreadLocalRandom.current().nextX(...)} (where

* {@code X} is {@code Int}, {@code Long}, etc).

* When all usages are of this form, it is never possible to

* accidently share a {@code ThreadLocalRandom} across multiple threads.

* This class also provides additional commonly used bounded random

* generation methods.

* Instances of {@code ThreadLocalRandom} are not cryptographically

* secure. Consider instead using {@link java.security.SecureRandom}

* in security-sensitive applications. Additionally,

* default-constructed instances do not use a cryptographically random

* seed unless the {@linkplain System#getProperty system property}

* {@code java.util.secureRandomSeed} is set to {@code true}.

* @since 1.7

* @author Doug Lea

public class ThreadLocalRandom extends Random {

int nextInt = ThreadLocalRandom.current().nextInt(10);

Random实例不是安全可靠的加密，可以使⽤java.security.SecureRandom来提供⼀个可靠的加密。

Random implements Serializable 可序列化的

AtomicLong seed 原⼦变量

解密随机数⽣成器（2）——从java源码看线性同余算法

上篇博客中，我们了解了基于物理现象的真随机数⽣成器，然⽽，真随机数产⽣速度较慢，为了实际计算需要，计算机中的随机数都是由程

序算法，也就是某些公式函数⽣成的，只不过对于同⼀随机种⼦与函数，得到的随机数列是⼀定的，因此得到的随机数可预测且有周期，不

能算是真正的随机数，因此称为伪随机数（Pseudo Random Number）。

不过，别看到伪字就瞧不起，这⾥⾯也是有学问的，看似⼏个简简单单的公式可能是前辈们努⼒了⼏代的成果，相关的研究可以写好⼏本书

了！

顺便提⼀下，亚裔唯⼀图灵奖得主姚期智，研究的就是伪随机数⽣成论（The pseudo random number generating theory）。

在这⾥，我重点介绍两个常⽤的算法：同余法（Congruential method）和梅森旋转算法（Mersenne twister）

1、同余法

同余法（Congruential method）是很常⽤的⼀种随机数⽣成⽅法，在很多编程语⾔中有应⽤，最明显的就是java了，java.util.Random类中⽤的就是同余法中的⼀种——线性同余法（Linear congruential

method），除此之外还有乘同余法（Multiplicative congruential method）和混合同余法（Mixed congruential method）。好了，现在我们就打开java的源代码，看⼀看线性同余法的真⾯⽬！

在Eclipse中输⼊java.util.Random，按F3转到Random类的源代码：

⾸先，我们看到这样⼀段说明：

翻译过来是：

这个类的⼀个实现是⽤来⽣成⼀串伪随机数。这个类⽤了⼀个48位的种⼦，被线性同余公式修改⽤来⽣成随机数。（见Donald Kunth《计算机编程的艺术》第⼆卷，章节3.2.1）

显然，java的Random类使⽤的是线性同余法来得到随机数的。

接着往下看，我们到了它的构造函数与⼏个⽅法，⾥⾯包含了获得48位种⼦的过程：

private static final AtomicLong seedUniquifier = new AtomicLong(8682522807148012L);

/**

* Creates a new random number generator. This constructor sets

* the seed of the random number generator to a value very likely

* to be distinct from any other invocation of this constructor.

public Random() {

this(seedUniquifier() ^ System.nanoTime());

}

private static long seedUniquifier() {

// L'Ecuyer, "Tables of Linear Congruential Generators of

// Different Sizes and Good Lattice Structure", 1999

for (;;) {

long current = ();

long next = current * 181783497276652981L;

if (seedUniquifierpareAndSet(current, next))

return next;

}

private static final AtomicLong seedUniquifier

= new AtomicLong(8682522807148012L);

public Random(long seed) {

if (getClass() == Random.class)

this.seed = new AtomicLong(initialScramble(seed));

else {

/ subclass might have overriden setSeed

this.seed = new AtomicLong();

setSeed(seed);

}

private static long initialScramble(long seed) {

return (seed ^ multiplier) & mask;

}

urrent.atomic.AtomicLong

public final boolean compareAndSet(long expect,

long update)

Atomically sets the value to the given updated value if the current value == the expected value.

Parameters:

expect - the expected value

update - the new value

Returns:

true if successful. False return indicates that the actual value was not equal to the expected value.

这⾥使⽤了System.nanoTime()⽅法来得到⼀个纳秒级的时间量，参与48位种⼦的构成，然后还进⾏了⼀个很变态的运算——不断乘以181783497276652981L，直到某⼀次相乘前后结果相同——来进⼀步增⼤随机性，这⾥的nanotime可以算是⼀个真随机数，不过有必要提的是，nanoTime和我们常⽤的currenttime⽅法不同，返回的不是从1970年1⽉1⽇到现在的时间，⽽是⼀个随机的数——只⽤来前后⽐较计算⼀个时间段，⽐如⼀⾏代码的运⾏时间，数据库导⼊的时间等，⽽不能⽤来计算今天是哪⼀天。

/**

* Returns the current value of the running Java Virtual Machine's

* high-resolution time source, in nanoseconds.

* This method can only be used to measure elapsed time and is

* not related to any other notion of system or wall-clock time.

* The value returned represents nanoseconds since some fixed but

* arbitrary origin time (perhaps in the future, so values

* may be negative). The same origin is used by all invocations of

* this method in an instance of a Java virtual machine; other

* virtual machine instances are likely to use a different origin.

* This method provides nanosecond precision, but not necessarily

* nanosecond resolution (that is, how frequently the value changes)

* - no guarantees are made except that the resolution is at least as

* good as that of {@link #currentTimeMillis()}.

* Differences in successive calls that span greater than

* approximately 292 years (263 nanoseconds) will not

* correctly compute elapsed time due to numerical overflow.

* The values returned by this method become meaningful only when

* the difference between two such values, obtained within the same

* instance of a Java virtual machine, is computed.

* For example, to measure how long some code takes to execute:

* <pre> {@code

* long startTime = System.nanoTime();

* // ... the code being measured ...

* long estimatedTime = System.nanoTime() - startTime;}</pre>

* To compare two nanoTime values

* <pre> {@code

* long t0 = System.nanoTime();

* ...

* long t1 = System.nanoTime();}</pre>

* one should use {@code t1 - t0 < 0}, not {@code t1 < t0},

* because of the possibility of numerical overflow.

* @return the current value of the running Java Virtual Machine's

* high-resolution time source, in nanoseconds

* @since 1.5

public static native long nanoTime();

好了，现在我不得不佩服这位⼯程师的变态了：到⽬前为⽌，这个程序已经⾄少进⾏了三次随机：

1、获得⼀个长整形数作为“初始种⼦”（系统默认的是8682522807148012L）

2、不断与⼀个变态的数——181783497276652981L相乘（天知道这些数是不是⼯程师随便滚键盘滚出来的-.-）得到⼀个不能预测的值，直到能把这个不能事先预期的值赋给Random对象的静态常量seedUniquifier 。因为多线程环境下赋值操作可能失败，就for(;;)来保证⼀定要赋值成功

3、与系统随机出来的nanotime值作异或运算，得到最终的种⼦

再往下看，就是我们常⽤的得到随机数的⽅法了，我⾸先到了最常⽤的nextInt（）函数，代码如下：

public int nextInt() {

return next(32);

}

代码很简洁，直接跳到了next函数：

protected int next(int bits) {

long oldseed, nextseed;

AtomicLong seed = this.seed;

do {

oldseed = ();

nextseed = (oldseed * multiplier + addend) & mask;

} while (!seedpareAndSet(oldseed, nextseed));

return (int)(nextseed >>> (48 - bits));

}

OK，祝贺⼀下怎么样，因为我们已经深⼊到的线性同余法的核⼼了——没错，就是这⼏⾏代码！

在分析这段代码前，先来简要介绍⼀下线性同余法。

在程序中为了使表达式的结果⼩于某个值，我们常常采⽤取余的操作，结果是同⼀个除数的余数，这种⽅法叫同余法（Congruential method）。

线性同余法是⼀个很古⽼的随机数⽣成算法，它的数学形式如下：

Xn+1 = (a*Xn+c)(mod m)

其中，

m>0,0<a<m,0<c<m

这⾥Xn这个序列⽣成⼀系列的随机数，X0是种⼦。随机数产⽣的质量与m，a，c三个参数的选取有很⼤关系。这些随机数并不是真正的随机，⽽是满⾜在某⼀周期内随机分布，这个周期的最长为m。根据Hull-Dobell Theorem，当且仅当：

1. c和m互素;

2. a-1可被所有m的质因数整除;

3. 当m是4的整数倍，a-1也是4的整数倍时，周期为m。所以m⼀般都设置的很⼤，以延长周期。

现在我们回过头来看刚才的程序，注意这⾏代码：

nextseed = (oldseed * multiplier + addend) & mask;

和Xn+1=(a*Xn+c)(mod m)的形式很像有⽊有！

没错，就是这⼀⾏代码应⽤到了线性同余法公式！不过还有⼀个问题：怎么没见取余符号？嘿嘿，先让我们看看三个变量的数值声明：

private static final long multiplier = 0x5DEECE66DL;

private static final long addend = 0xBL;

private static final long mask = (1L << 48) - 1;

其中multiplier和addend分别代表公式中的a和c，很好理解，但mask代表什么呢？其实，x & [(1L << 48)–1]与 x（mod 2^48）等价。解释如下：

x对于2的N次幂取余，由于除数是2的N次幂，如：

0001，0010，0100，1000。。。。

相当于把x的⼆进制形式向右移N位，此时移到⼩数点右侧的就是余数，如：

13 = 1101 8 = 1000

13 / 8 = 1.101，所以⼩数点右侧的101就是余数，化成⼗进制就是5

然⽽，⽆论是C语⾔还是java，位运算移⾛的数显然都⼀去不复返了。（什么，你说在CF寄存器中？好吧，太⾼端了点，其实还有更给⼒的⽅法）有什么好办法保护这些即将逝去的数据呢？

学着上⾯的mask，我们不妨试着把2的N次幂减⼀：

0000，0001，0011，0111，01111，011111。。。

怎么样，有启发了吗？

我们知道，某个数（限0和1）与1作与（&）操作，结果还是它本⾝；⽽与0作与操作结果总是0，即：

a & 1 = a, a & 0 = 0

⽽我们将x对2^N取余操作希望达到的⽬的可以理解为：

1、所有⽐2^N位（包括2^N那⼀位）全都为0

2、所有⽐2^N低的位保持原样

因此， x & （2^N-1）与x（mod 2^N）运算等价，还是13与8的例⼦：

1101 % 1000 = 0101 1101 & 0111 = 0101

⼆者结果⼀致。

嘿嘿，讲明⽩了这个与运算的含义，我想上⾯那⾏代码的含义应该很明了了，就是线性同余公式的直接套⽤，其中a = 0x5DEECE66DL, c = 0xBL, m = 2^48，就可以得到⼀个48位的随机数，⽽且这个谨慎的⼯程师进⾏了迭代，增加结果的随机性。再把结果移位，就可以得到指定位数的随机数。

接下来我们研究⼀下更常⽤的⼀个函数——带参数n的nextInt：java源代码加密

public int nextInt(int n) {

if (n <= 0)

throw new IllegalArgumentException("n must be positive");

if ((n & -n) == n) // i.e., n is a power of 2

return (int)((n * (long)next(31)) >> 31);

int bits, val;

do {

bits = next(31);

val = bits % n;

} while (bits - val + (n-1) < 0);

return val;

}

显然，这⾥基本的思路还是⼀样的，先调⽤next函数⽣成⼀个31位的随机数（int类型的范围），再对参数n进⾏判断，如果n恰好为2的⽅幂，那么直接移位就可以得到想要的结果；如果不是2的⽅幂，那么就关于n取余，最终使结果在[0,n)范围内。另外，do-while语句的⽬的应该是防⽌结果为负数。

你也许会好奇为什么(n & -n) == n可以判断⼀个数是不是2的次⽅幂，其实我也是研究了⼀番才弄明⽩

的，其实，这主要与补码的特性有关：众所周知，计算机中负数使⽤补码储存的（不懂什么是补码的⾃⼰百度恶补），举⼏组例⼦：

2 ：0000 0010 -2 ：1111 1110

8 ：0000 1000 -8 ：1111 1000

18 ：0001 0010 -18 ：1110 1110

20 ：0001 0100 -20 ：1110 1100

不知道⼤家有没有注意到，补码有⼀个特性，就是可以对于两个相反数n与-n，有且只有最低⼀个为1的位数字相同且都为1，⽽更低的位全为0，更⾼的位各不相同。因此两数作按位与操作后只有⼀位为1，⽽能满⾜这个结果仍为n的只能是原本就只有⼀位是1的数，也就是恰好是2的次⽅幂的数了。

不过个⼈觉得还有⼀种更好的判断2的次⽅幂的⽅法：

n & (n-1) == 0

感兴趣的也可以⾃⼰研究⼀下^o^。

好了，线性同余法就介绍到这了，下⾯简要介绍⼀下另⼀种同余法——乘同余法（Multiplicative congruential method）。

上⽂中的线性同余法，主要⽤来⽣成整数，⽽某些情景下，⽐如科研中，常常只需要（0，1）之间的⼩数，这时，乘同余法是更好的选择，它的基本公式和线性同余法很像：

Xn+1=（a*Xn ）(mod m ）

其实只是令线性公式中的c=0⽽已。只不过，为了得到⼩数，我们多做⼀步：

Yn = Xn/m

由于Xn是m的余数，所以Yn的值介于0与1之间，由此到（0，1）区间上的随机数列。

除此之外，还有混合同余法，⼆次同余法，三次同余法等类似的⽅法，公式类似，也各有优劣，在此不详细介绍了。

同余法优势在计算速度快，内存消耗少。但是，因为相邻的随机数并不独⽴，序列关联性较⼤。所以，对于随机数质量要求⾼的应⽤，特别是很多科研领域，并不适合⽤这种⽅法。

不要⾛开，下篇博客介绍⼀个更给⼒的算法——梅森旋转算法（Mersenne Twister），持续关注啊！

ption/program/1609435.html

688IT编程网

解密随机数生成器(二)——从java源码看线性同余算法

发表评论

推荐文章

java正则表达式选择题

一种基于正则表达式的DBC文件解析及报文分析方法[发明专利]

工龄小数点提取

非零金额正则表达式

提取文本中数字的函数

热门文章

excel文字递增函数公式

数字递增公式

notepad 正则变量运算

C++regex库常用函数及实例

js正则表达式之前瞻后顾与非捕获分组

indesign正则数字和英文之间的空格

C#匹配中文字符串的4种正则表达式分享

PHP正则表达式匹配中文字符

匹配中文汉字的正则表达式介绍

Python正则表达式如何进行字符串替换

orcl中用正则表达式

sql正则表达式excel

dataframe正则表达式

postgress sql正则

el-upload accept 正则表达式

半小时正则表达式

判断科学计数法的正则

根据url判断静态资源的方法

Java正则表达式-匹配正负浮点数

替换模糊匹配正则-hive

最新文章

一种基于正则表达式的DBC文件解析及报文分析方法[发明专利]

能被5整除的十进制整数的正规表达式

大于0小于等于1的正则表达式

linux grep 26个字母

java pattern 正则表达式

掌握文本编辑器中的搜索和替换技巧

标签列表

688IT编程网

解密随机数生成器(二)——从java源码看线性同余算法

发表评论

推荐文章

java正则表达式 选择题

一种基于正则表达式的DBC文件解析及报文分析方法[发明专利]

工龄小数点提取

非零金额 正则表达式

提取文本中数字的函数

热门文章

excel文字递增函数公式

数字递增公式

notepad 正则变量运算

C++regex库常用函数及实例

js正则表达式之前瞻后顾与非捕获分组

indesign正则数字和英文之间的空格

C#匹配中文字符串的4种正则表达式分享

PHP正则表达式匹配中文字符

匹配中文汉字的正则表达式介绍

Python正则表达式如何进行字符串替换

orcl中用正则表达式

sql正则表达式excel

dataframe正则表达式

postgress sql正则

el-upload accept 正则表达式

半小时 正则表达式

判断科学计数法的正则

根据url判断静态资源的方法

Java正则表达式-匹配正负浮点数

替换模糊匹配正则-hive

最新文章

一种基于正则表达式的DBC文件解析及报文分析方法[发明专利]

能被5整除的十进制整数的正规表达式

大于0小于等于1的正则表达式

linux grep 26个字母

java pattern 正则表达式

掌握文本编辑器中的搜索和替换技巧

标签列表

java正则表达式选择题

非零金额正则表达式

半小时正则表达式