KStwobign Distribution

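This is the limiting distribution of the normalized maximum absolute difference between an empirical distribution function, computed from \(n\) samples or observations, and a comparison (or target) cumulative distribution function. (ksone is the distribution of the unnormalized positive difference, \(D_n^+\).)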

Writing \(D_n = \sup_t \left|F_{empirical,n}(t) - F_{target}(t)\right|\), the normalization factor is \(\sqrt{n}\), and kstwobign is the limiting distribution of the \(\sqrt{n} D_n\) values as \(n\rightarrow\infty\).

Note that \(D_n=\max(D_n^+, D_n^-)\), but \(D_n^+\) and \(D_n^-\) are not independent.
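The definitions above can be illustrated with a minimal sketch that computes \(D_n^+\), \(D_n^-\) and \(D_n\) for one sample and then applies the \(\sqrt{n}\) normalization. The helper names `ks_statistics` and `uniform_cdf`, the uniform target on \([0, 1]\), and the sample size are illustrative assumptions, not part of SciPy.

```python
import math
import random


def ks_statistics(sample, target_cdf):
    """Return (D_plus, D_minus, D) for a sample against a target CDF.

    Hypothetical helper for illustration; not a SciPy function.
    """
    xs = sorted(sample)
    n = len(xs)
    cdf_vals = [target_cdf(x) for x in xs]
    # The empirical CDF jumps from (i - 1)/n to i/n at the i-th order statistic,
    # so the one-sided suprema are attained at the sample points.
    d_plus = max(i / n - f for i, f in enumerate(cdf_vals, start=1))
    d_minus = max(f - (i - 1) / n for i, f in enumerate(cdf_vals, start=1))
    return d_plus, d_minus, max(d_plus, d_minus)


def uniform_cdf(t):
    """CDF of the uniform distribution on [0, 1] (assumed target)."""
    return min(max(t, 0.0), 1.0)


random.seed(0)
n = 1000
sample = [random.random() for _ in range(n)]

d_plus, d_minus, d_n = ks_statistics(sample, uniform_cdf)
scaled = math.sqrt(n) * d_n  # the quantity whose large-n law is kstwobign
print(d_plus, d_minus, d_n, scaled)
```

Both one-sided statistics are computed from the same sorted sample, which is one way to see why they cannot be independent.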

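kstwobign can also be used for the difference between two empirical distribution functions, for sets of observations with \(m\) and \(n\) samples respectively, where \(m\) and \(n\) are both "large". Writing \(D_{m,n} = \sup_t \left|F_{1,m}(t)-F_{2,n}(t)\right|\), where \(F_{1,m}\) and \(F_{2,n}\) are the two empirical distribution functions, kstwobign is also the limiting distribution of the \(\sqrt{\frac{mn}{m+n}}D_{m,n}\) values, as \(m,n\rightarrow\infty\) and \(m/n\rightarrow a \ne 0, \infty\).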
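The two-sample statistic and its scaling can be sketched as follows. The function names `two_sample_ks` and `scaled_two_sample_statistic` are hypothetical, and the normal samples are made-up illustration data; the sketch relies only on the fact that the difference of two step functions can change only at observed data points.

```python
import bisect
import math
import random


def two_sample_ks(a, b):
    """Return D_{m,n} = sup_t |F_1(t) - F_2(t)| for two samples.

    Hypothetical helper for illustration; not a SciPy function.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    m, n = len(a_sorted), len(b_sorted)
    d = 0.0
    # Both empirical CDFs are right-continuous step functions, so it is
    # enough to evaluate their difference at every observed value.
    for t in a_sorted + b_sorted:
        f1 = bisect.bisect_right(a_sorted, t) / m
        f2 = bisect.bisect_right(b_sorted, t) / n
        d = max(d, abs(f1 - f2))
    return d


def scaled_two_sample_statistic(a, b):
    """Return sqrt(m*n/(m+n)) * D_{m,n}, the quantity with the kstwobign limit."""
    m, n = len(a), len(b)
    return math.sqrt(m * n / (m + n)) * two_sample_ks(a, b)


random.seed(1)
first = [random.gauss(0.0, 1.0) for _ in range(400)]
second = [random.gauss(0.0, 1.0) for _ in range(600)]
print(two_sample_ks(first, second), scaled_two_sample_statistic(first, second))
```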

There are no shape parameters, and the support is \(x\in\left[0,\infty\right)\).

\begin{eqnarray*} F\left(x\right) & = & 1 - 2 \sum_{k=1}^{\infty} (-1)^{k-1} e^{-2k^2 x^2}\\ & = & \frac{\sqrt{2\pi}}{x} \sum_{k=1}^{\infty} e^{-(2k-1)^2 \pi^2/(8x^2)}\\ & = & 1 - \textrm{scipy.special.kolmogorov}(x) \\ f\left(x\right) & = & 8x \sum_{k=1}^{\infty} (-1)^{k-1} k^2 e^{-2k^2 x^2} \end{eqnarray*}
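As a numerical check of these formulas, the following sketch evaluates both series for \(F\) and the series for \(f\) using only the standard library. The function names and the truncation at 100 terms are illustrative assumptions; this is not how SciPy evaluates the distribution.

```python
import math


def cdf_alternating(x, terms=100):
    """F(x) from the alternating series; converges fastest for large x."""
    if x <= 0:
        return 0.0
    s = sum((-1) ** (k - 1) * math.exp(-2 * k * k * x * x)
            for k in range(1, terms + 1))
    return 1.0 - 2.0 * s


def cdf_theta(x, terms=100):
    """F(x) from the second series; converges fastest for small x."""
    if x <= 0:
        return 0.0
    s = sum(math.exp(-((2 * k - 1) ** 2) * math.pi ** 2 / (8 * x * x))
            for k in range(1, terms + 1))
    return math.sqrt(2 * math.pi) / x * s


def pdf_series(x, terms=100):
    """f(x) from the term-by-term derivative of the alternating series."""
    if x <= 0:
        return 0.0
    return 8 * x * sum((-1) ** (k - 1) * k * k * math.exp(-2 * k * k * x * x)
                       for k in range(1, terms + 1))


for x in (0.5, 1.0, 2.0):
    print(x, cdf_alternating(x), cdf_theta(x), pdf_series(x))
```

The two series for \(F\) are complementary: the alternating form needs only a few terms when \(x\) is large, while the second form needs only a few when \(x\) is small. With SciPy installed, `scipy.stats.kstwobign.cdf(x)` and `1 - scipy.special.kolmogorov(x)` should agree with these values to within floating-point error.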

References

“Kolmogorov-Smirnov test”, Wikipedia https://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test
Kolmogoroff, A. “Confidence Limits for an Unknown Distribution Function.” Ann. Math. Statist. 12 (1941), no. 4, 461–463.
Smirnov, N. “On the estimation of the discrepancy between empirical curves of distribution for two independent samples.” Bull. Math. Univ. Moscou 2 (1939), 2–26.
Feller, W. “On the Kolmogorov-Smirnov Limit Theorems for Empirical Distributions.” Ann. Math. Statist. 19 (1948), no. 2, 177–189. and “Errata” Ann. Math. Statist. 21 (1950), no. 2, 301–302.

Implementation: scipy.stats.kstwobign