scipy.stats.qmc.

Sobol#

class scipy.stats.qmc.Sobol(d, *, scramble=True, bits=None, rng=None, optimization=None)[source]#

用于生成（扰乱的）Sobol’序列的引擎。

Sobol’ 序列是低差异的，准随机数。可以使用两种方法绘制点

random_base2：安全地绘制 \(n=2^m\) 个点。此方法保证了序列的平衡属性。
random：从序列中绘制任意数量的点。请参阅下面的警告。

参数:

dint

序列的维度。最大维度为 21201。

scramblebool, optional

如果为 True，则使用 LMS+shift 扰乱。否则，不进行扰乱。默认为 True。

bitsint, optional

生成器的位数。控制可以生成的最大点数，即 2**bits。最大值为 64。它不对应于返回类型，返回类型始终为 np.float64，以防止点重复自身。默认为 None，为了向后兼容，它对应于 30。

1.9.0 版本新增。

optimization{None, “random-cd”, “lloyd”}, optional

是否使用优化方案来提高采样后的质量。请注意，这是一个后处理步骤，不能保证采样的所有属性都会被保留。默认为 None。

random-cd：坐标的随机排列以降低中心差异。基于中心差异的最佳样本会不断更新。与其他差异度量相比，基于中心差异的采样在 2D 和 3D 子投影中表现出更好的空间填充稳健性。
lloyd：使用修改后的 Lloyd-Max 算法扰动样本。该过程收敛到等间距的样本。

1.10.0 版本新增。

rngnumpy.random.Generator, optional

伪随机数生成器状态。当 rng 为 None 时，将使用来自操作系统的熵创建一个新的 numpy.random.Generator。除了 numpy.random.Generator 之外的类型将传递给 numpy.random.default_rng 以实例化 Generator。

在 1.15.0 版本中更改：作为从使用 numpy.random.RandomState 过渡到 numpy.random.Generator 的 SPEC-007 转换的一部分，此关键字已从 seed 更改为 rng。在过渡期间，两个关键字将继续工作，尽管一次只能指定一个关键字。在过渡期结束后，使用 seed 关键字的函数调用将发出警告。在弃用期之后，将删除 seed 关键字。

方法

`fast_forward`(n)	将序列快速前进 n 个位置。
`integers`(l_bounds, *[, u_bounds, n, ...])	从 l_bounds (包含) 到 u_bounds (不包含) 绘制 n 个整数，如果 endpoint=True，则从 l_bounds (包含) 到 u_bounds (包含)。
`random`([n, workers])	在半开区间 `[0, 1)` 中绘制 n。
`random_base2`(m)	从 Sobol' 序列中抽取点。
`reset`()	将引擎重置为基本状态。

注意

Sobol’ 序列 [1] 提供 \(n=2^m\) 个低差异点，位于 \([0,1)^{d}\) 中。扰乱它们 [3] 使它们适合于奇异积分，提供了一种误差估计的方法，并且可以提高它们的收敛速度。实现的扰乱策略是（左）线性矩阵扰乱 (LMS)，然后是数字随机移位 (LMS+shift) [2]。

Sobol’ 序列有很多版本，具体取决于它们的“方向数”。此代码使用来自 [4] 的方向数。因此，最大维度数为 21201。方向数已经使用搜索标准 6 进行了预先计算，并且可以在 https://web.maths.unsw.edu.au/~fkuo/sobol/ 检索。

警告

Sobol’ 序列是一种求积规则，如果使用的样本大小不是 2 的幂，或者跳过第一个点，或者减少序列 [5]，它们将失去其平衡属性。

如果 \(n=2^m\) 个点不够，则应取 \(2^M\) 个点，其中 \(M>m\)。扰乱时，独立副本的数量 R 不必是 2 的幂。

Sobol’ 序列生成到一些 \(B\) 位数。生成了 \(2^B\) 个点后，序列将重复。因此，会引发错误。位数可以使用参数 bits 控制。

参考

[1]

I. M. Sobol’, “The distribution of points in a cube and the accurate evaluation of integrals.” Zh. Vychisl. Mat. i Mat. Phys., 7:784-802, 1967.

[2]

J. Matousek, “On the L2-discrepancy for anchored boxes.” J. of Complexity 14, 527-556, 1998.

[3]

Art B. Owen, “Scrambling Sobol and Niederreiter-Xing points.” Journal of Complexity, 14(4):466-489, December 1998.

[4]

S. Joe and F. Y. Kuo, “Constructing sobol sequences with better two-dimensional projections.” SIAM Journal on Scientific Computing, 30(5):2635-2654, 2008.

[5]

Art B. Owen, “On dropping the first Sobol’ point.” arXiv:2008.08051, 2020.

示例

从 Sobol' 的低差异序列生成样本。

>>> from scipy.stats import qmc
>>> sampler = qmc.Sobol(d=2, scramble=False)
>>> sample = sampler.random_base2(m=3)
>>> sample
array([[0.   , 0.   ],
       [0.5  , 0.5  ],
       [0.75 , 0.25 ],
       [0.25 , 0.75 ],
       [0.375, 0.375],
       [0.875, 0.875],
       [0.625, 0.125],
       [0.125, 0.625]])

使用差异标准计算样本的质量。

>>> qmc.discrepancy(sample)
0.013882107204860938

要继续现有的设计，可以通过再次调用 random_base2 获取额外的点。或者，您可以跳过一些点，例如

>>> _ = sampler.reset()
>>> _ = sampler.fast_forward(4)
>>> sample_continued = sampler.random_base2(m=2)
>>> sample_continued
array([[0.375, 0.375],
       [0.875, 0.875],
       [0.625, 0.125],
       [0.125, 0.625]])

最后，可以将样本缩放到边界。

>>> l_bounds = [0, 2]
>>> u_bounds = [10, 5]
>>> qmc.scale(sample_continued, l_bounds, u_bounds)
array([[3.75 , 3.125],
       [8.75 , 4.625],
       [6.25 , 2.375],
       [1.25 , 3.875]])