Calculating the Pooled Standard Deviation in Python

2024-01-23 00:41 | Source: compiled from the web

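The standard deviation measures how spread out the numbers in a data set are. A small standard deviation means the elements lie close to the mean of the data set. A large standard deviation means the elements differ substantially from the mean.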

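Python can compute the standard deviation, as shown here. Python 3.x offers many libraries for statistical computation. The statistics module is part of the Python standard library and provides descriptive statistics. It is a good choice when the data set is not too large or when we cannot rely on importing third-party libraries.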
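As a minimal illustration of the built-in module, the sketch below contrasts statistics.stdev() (sample standard deviation, dividing by n - 1) with statistics.pstdev() (population standard deviation, dividing by n). The list `data` is only an example value chosen here.

```python
import statistics

data = [4, 5, 6]  # example values, chosen for illustration

# Sample standard deviation: divides the squared deviations by n - 1
sample_sd = statistics.stdev(data)

# Population standard deviation: divides the squared deviations by n
population_sd = statistics.pstdev(data)

print("sample SD     =", sample_sd)
print("population SD =", population_sd)
```

The sample version is never smaller than the population version for the same data, because it divides by n - 1 rather than n.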

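Pooled standard deviation:

The pooled standard deviation combines the standard deviations of two or more groups into a single estimate. It is the square root of a weighted average of the group variances (the squared standard deviations), in which larger samples are given more "weight".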

For reference, here is Cohen's alternative formula:

$$SD_{pooled} = \sqrt{\frac{(n_1-1)\,SD_1^2 + (n_2-1)\,SD_2^2}{n_1+n_2-2}}$$

where

SD1 = standard deviation of group 1
SD2 = standard deviation of group 2
n1 = sample size of group 1
n2 = sample size of group 2
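To see the weighting at work, the following sketch rewrites the formula as a weighted average of the two variances, with weights (n - 1)/(n1 + n2 - 2). The numbers assigned to sd1, n1, sd2 and n2 are made-up values for illustration, not data from this article.

```python
import math

# Hypothetical summary statistics for two groups (illustrative values only)
sd1, n1 = 2.0, 10
sd2, n2 = 4.0, 30

df_total = n1 + n2 - 2      # pooled degrees of freedom
w1 = (n1 - 1) / df_total    # weight of group 1
w2 = (n2 - 1) / df_total    # weight of group 2 (the two weights sum to 1)

# Weighted average of the variances, then the square root
pooled_variance = w1 * sd1**2 + w2 * sd2**2
sd_pooled = math.sqrt(pooled_variance)

print("weights:", w1, w2)
print("pooled SD:", sd_pooled)
```

Because group 2 is three times larger in this example, the result lies closer to 4.0 than to 2.0.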

For two samples of equal size, the formula simplifies to:

$$SD_{pooled} = \sqrt{\frac{SD_1^2 + SD_2^2}{2}}$$
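The equal-size case follows in a few lines from the general two-group formula. Writing n for the common sample size (a symbol introduced here for convenience, with n > 1), the factor n - 1 cancels:

```latex
\begin{aligned}
SD_{pooled} &= \sqrt{\frac{(n-1)\,SD_1^2 + (n-1)\,SD_2^2}{n + n - 2}} \\
            &= \sqrt{\frac{(n-1)\,(SD_1^2 + SD_2^2)}{2\,(n-1)}} \\
            &= \sqrt{\frac{SD_1^2 + SD_2^2}{2}}
\end{aligned}
```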

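Calculation steps:

1. Import statistics (the Python standard library module that provides the standard deviation).
2. Import math (for sqrt).
3. Determine the size of each sample with Python's len function (for example, n1 = len(sample1)).
4. Compute the standard deviation of each sample (for example, statistics.stdev(sample1) for sample1).
5. Finally, apply the formula to compute the pooled standard deviation of the samples.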

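$$\text{Pooled standard deviation} = \sqrt{\frac{(n_1-1)\,SD_1^2 + (n_2-1)\,SD_2^2}{n_1+n_2-2}}$$

Here SD1 and SD2 are the standard deviations of sample1 and sample2, and n1 and n2 are their sizes.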

Note: statistics.stdev() raises StatisticsError if a sample has fewer than two values, which includes an empty sample.
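One way to handle that error is sketched below. The helper name `safe_stdev` is hypothetical (it is not part of the statistics module), and returning None for an undefined result is just one possible design choice.

```python
import statistics

def safe_stdev(sample):
    """Return the sample standard deviation, or None if it is undefined.

    Hypothetical helper: statistics.stdev() needs at least two values.
    """
    try:
        return statistics.stdev(sample)
    except statistics.StatisticsError:
        return None

# An empty sample, a single value, and a valid sample
results = [safe_stdev(s) for s in ([], [5], [4, 5, 6])]
print(results)  # [None, None, 1.0]
```

Letting the exception propagate instead may be preferable when an undersized sample indicates a bug in the calling code.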

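Step 1: Let's try this with an example:

First we import the required modules. Then, suppose we have two samples, sample1 = [4, 5, 6] and sample2 = [10, 12, 14, 16, 18, 20]. statistics.stdev(sample1) computes the standard deviation of sample1 (the statistics.stdev() function computes the sample standard deviation of a list of values).

Python 3 implementation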

    # Import modules
    import math
    import statistics

    sample1 = [4, 5, 6]
    # Compute the sample standard deviation of sample1
    SD1 = statistics.stdev(sample1)
    print(f"Standard Deviation for 1st sample = {SD1}")

    sample2 = [10, 12, 14, 16, 18, 20]
    # Compute the sample standard deviation of sample2
    SD2 = statistics.stdev(sample2)
    print(f"Standard Deviation for 2nd sample = {SD2}")

Output:

    Standard Deviation for 1st sample = 1.0
    Standard Deviation for 2nd sample = 3.7416573867739413

Step 2: Next, let's compute the size of each sample with Python's len function.

Python 3 implementation

    import math
    import statistics

    sample1 = [4, 5, 6]
    # Compute the sample standard deviation of sample1
    SD1 = statistics.stdev(sample1)

    sample2 = [10, 12, 14, 16, 18, 20]
    # Compute the sample standard deviation of sample2
    SD2 = statistics.stdev(sample2)

    # Size of the 1st sample
    n1 = len(sample1)
    # Size of the 2nd sample
    n2 = len(sample2)

    print(f"sample1 : length = {n1} | S.D. = {SD1}")
    print(f"sample2 : length = {n2} | S.D. = {SD2}")

Output:

    sample1 : length = 3 | S.D. = 1.0
    sample2 : length = 6 | S.D. = 3.7416573867739413

Step 3: Finally, we compute the pooled standard deviation using the formula above.

Python 3 implementation

    import math
    import statistics

    sample1 = [4, 5, 6]
    # Compute the sample standard deviation of sample1
    SD1 = statistics.stdev(sample1)

    sample2 = [10, 12, 14, 16, 18, 20]
    # Compute the sample standard deviation of sample2
    SD2 = statistics.stdev(sample2)

    # Size of the 1st sample
    n1 = len(sample1)
    # Size of the 2nd sample
    n2 = len(sample2)

    # Weighted average of the two variances, then the square root
    pooled_standard_deviation = math.sqrt(
        ((n1 - 1) * SD1 * SD1 + (n2 - 1) * SD2 * SD2) / (n1 + n2 - 2)
    )
    print(f"Pooled Standard Deviation = {pooled_standard_deviation}")

Output:

Pooled Standard Deviation = 3.2071349029490928
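Since the pooled standard deviation is defined for two or more groups, the calculation can be wrapped in a reusable function. The sketch below is one possible generalization, not code from the original example: the name `pooled_sd` is hypothetical, and the k-group form (sum of (n_i - 1) times each variance, divided by the total of the n_i - 1 terms) is the usual extension of the two-group formula. It uses statistics.variance() so the standard deviations do not have to be squared again.

```python
import math
import statistics

def pooled_sd(*samples):
    """Pooled standard deviation of two or more samples (hypothetical helper)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # Sum of (n_i - 1) * variance_i over all groups;
    # statistics.variance() raises StatisticsError for samples with < 2 values
    weighted_var_sum = sum((len(s) - 1) * statistics.variance(s) for s in samples)
    # Total degrees of freedom: sum of (n_i - 1)
    df_total = sum(len(s) - 1 for s in samples)
    return math.sqrt(weighted_var_sum / df_total)

sample1 = [4, 5, 6]
sample2 = [10, 12, 14, 16, 18, 20]
print("Pooled Standard Deviation =", pooled_sd(sample1, sample2))
```

For these two samples the result should agree with the value of about 3.2071 obtained in Step 3, up to floating-point rounding. A third group can be added as another argument, for example pooled_sd(sample1, sample2, [1, 3, 5]).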
