给定概率密度,生成随机数python实现--688IT编程网

给定概率密度，⽣成随机数python实现

实现的⽅法可以不⽌⼀种：

1. rejection sampling

2. invert the cdf

3. Metropolis Algorithm (MCMC)

⽬标：

已知 y=pdf(x），现想由给定的pdf, ⽣成对应分布的x

PDF是概率分布函数，对其积分或者求和可以得到CDF（累积概率分布函数），PDF积分或求和的结果始终为1

步骤（具体解释后⾯会说）：

1、根据pdf得到cdf

2、由cdf得到inverse of the cdf

3、对于给定的均匀分布[0,1)，带⼊inverse cdf，得到的结果即是我们需要的x

求cdf逆函数的具体⽅法：

对于上⾯的第⼆步，可以分成两类：

1、当CDF的逆函数好求时，直接根据公式求取，

2、反之当CDF的逆函数不好求时，⽤数值模拟⽅法

⾃⼰的理解：为什么需要根据cdf的逆去获得x？

原因⼀：

因为cdf是单调函数因此⼀定存在逆函数（cdf是s型函数，⽽pdf则不⼀定，例如正态分布，不单调，对于给定的y，可能存在两个对应的x，就不可逆）

原因⼆：

这仅是我⾃⼰的直观理解，根据下图所⽰（左上为pdf，右上为cdf）

由步骤3可知，我们⾸先⽣成[0，1)的均匀随机数，此随机数作为cdf的y，去映射到cdf的x（若⽤cdf的逆函数表⽰则是由x映射到y），可以参考上图的右上，既然cdf的y是均匀随机的，那么对于cdf中同样范围的x，斜率⼤的部分将会有更⼤的机会被映射，因为对应的y范围更⼤（⽽y是随即均匀分布的），那么，cdf的斜率也就等同于pdf的值，这正好符合若x的pdf较⼤，那么有更⼤的概率出现（即重复很多次后，该x会出现的次数最多）

代码实现——⽅法⼀，公式法

import numpy as np

import math

import random

import matplotlib.pyplot as plt

import collections

count_dict =dict()

bin_count =20

def inverseCDF():

"""

return the x value in PDF

"""

uniform_random = random.random()

return inverse_cdf(uniform_random)

def pdf(x):

return2* x

# cdf = x^2, 其逆函数很好求，因此直接⽤公式法

def inverse_cdf(x):

return math.sqrt(x)

def draw_pdf(D):

global bin_count

D = collections.OrderedDict(sorted(D.items()))

plt.bar(range(len(D)),list(D.values()), align='center') # 因为映射bin的时候采⽤的floor操作，因此加上0.5

value_list =[(key +0.5)/ bin_count for key in D.keys()] icks(range(len(D)), value_list)

plt.xlabel('x', fontsize=5)

plt.ylabel('counts', fontsize=5)

plt.title('counting bits')

plt.show()

for i in range(90000):

x = inverseCDF()

# ⽤bin去映射，否则不好操作

bin= math.floor(x * bin_count)# type(bin): int

count_dict[bin]= (bin,0)+1

draw_pdf(count_dict)

结果：

python生成1到100之间随机数

代码实现——⽅法⼆，数值法

数值模拟cdf的关键是创建lookup table，

table的size越⼤则结果越真实（即区间划分的个数）

import numpy as np

import math

import random

import matplotlib.pyplot as plt

import collections

lookup_table_size =40

CDFlookup_table = np.zeros((lookup_table_size))

count_dict =dict()

bin_count =20

def inverse_cdf_numerically(y):

global lookup_table_size

global CDFlookup_table

value =0.0

for i in range(lookup_table_size):

x = i *1.0/(lookup_table_size -1)

value += pdf2(x)

CDFlookup_table[i]= value

CDFlookup_table /= value # normalize the cdf

if y < CDFlookup_table[0]:

t = y / CDFlookup_table[0]

return t / lookup_table_size

index =-1

for j in range(lookup_table_size):

if CDFlookup_table[j]>= y:

index = j

break

# linear interpolation

t =(y - CDFlookup_table[index -1])/ \

(CDFlookup_table[index]- CDFlookup_table[index -1])

fractional_index = index + t # 因为index从0开始，所以不是 (index-1)+t return fractional_index / lookup_table_size

def inverseCDF():

"""

return the x value in PDF

"""

uniform_random = random.random()

return inverse_cdf_numerically(uniform_random)

def pdf2(x):

return(x * x * x -10.0* x * x +5.0* x +11.0)/(10.417)

def draw_pdf(D):

global bin_count

D = collections.OrderedDict(sorted(D.items()))

plt.bar(range(len(D)),list(D.values()), align='center') value_list =[(key +0.5)/ bin_count for key in D.keys()] icks(range(len(D)), value_list)

plt.xlabel('x', fontsize=5)

plt.ylabel('counts', fontsize=5)

plt.title('counting bits')

plt.show()

for i in range(90000):

x = inverseCDF()

bin= math.floor(x * bin_count)# type(bin): int

count_dict[bin]= (bin,0)+1

draw_pdf(count_dict)

真实函数与模拟结果

扩展：⽣成伯努利、正太分布

import numpy as np

import matplotlib.pyplot as plt

"""

reference:

/2017/07/25/counting-bits-the-normal-distribution/

"""

def plot_bar_x():

# this is for plotting purpose

index = np.arange(counting.shape[0])

plt.bar(index, counting)

plt.xlabel('x', fontsize=5)

plt.ylabel('counts', fontsize=5)

plt.title('counting bits')

plt.show()

# if dice_side=2, is binomial distribution

# if dice_side>2 , is multinomial distribution

dice_side =2

# if N becomes larger, then multinomial distribution will more like normal distribution N =100

counting = np.zeros(((dice_side -1)* N +1))

for i in range(30000):

sum=0

for j in range(N):

dice_result = np.random.randint(0, dice_side)

sum+= dice_result

counting[sum]+=1

# normalization

counting /= np.sum(counting)

plot_bar_x()

688IT编程网

给定概率密度,生成随机数python实现

发表评论

推荐文章

java正则表达式选择题

一种基于正则表达式的DBC文件解析及报文分析方法[发明专利]

工龄小数点提取

非零金额正则表达式

提取文本中数字的函数

热门文章

利用正则表达式实现文本数据提取与处理

正则表达式零宽断言详解

文本匹配规则

excel中使用正则

1-31正则表达式

anki之高级筛选

BUAA_OO_2021_第一单元总结

insert语句递增写法

sublime text 3在行前插入递增数字序号的方法

字符串只允许数字和英文的正则

powerbuilder 正则表达式

Shell脚本编写的高级技巧利用正则表达式进行字符串匹配

JAVA正则表达式的三种模式:贪婪,勉强和占有的讨论

go regexp匹配规则

oracle regexp_substr 实现原理

基本的元字符回溯引用和前后查匹配模式

elasticsearch query dsl正则

oracle sql正则表达式

GA-设置目标

仅匹配全角片假名的正则表达式

最新文章

java正则表达式选择题

工龄小数点提取

非零金额正则表达式

提取文本中数字的函数

vue数字相加小数点变长-概述说明以及解释

vue validate 正则验证小数长度

标签列表

688IT编程网

给定概率密度,生成随机数python实现

发表评论

推荐文章

java正则表达式 选择题

一种基于正则表达式的DBC文件解析及报文分析方法[发明专利]

工龄小数点提取

非零金额 正则表达式

提取文本中数字的函数

热门文章

利用正则表达式实现文本数据提取与处理

正则表达式零宽断言详解

文本匹配规则

excel中使用正则

1-31正则表达式

anki之高级筛选

BUAA_OO_2021_第一单元总结

insert语句递增写法

sublime text 3在行前插入递增数字序号的方法

字符串只允许数字和英文的正则

powerbuilder 正则表达式

Shell脚本编写的高级技巧利用正则表达式进行字符串匹配

JAVA正则表达式的三种模式:贪婪,勉强和占有的讨论

go regexp匹配规则

oracle regexp_substr 实现原理

基本的元字符 回溯引用和前后查 匹配模式

elasticsearch query dsl正则

oracle sql正则表达式

GA-设置目标

仅匹配全角片假名的正则表达式

最新文章

java正则表达式 选择题

工龄小数点提取

非零金额 正则表达式

提取文本中数字的函数

vue数字相加小数点变长-概述说明以及解释

vue validate 正则验证小数长度

标签列表

java正则表达式选择题

非零金额正则表达式

基本的元字符回溯引用和前后查匹配模式

java正则表达式选择题

非零金额正则表达式