
Measuring Diversity with the Gini–Simpson Index

2024-07-08 14:09 | Source: compiled from the web

Simpson index

\lambda =\sum_{i=1}^{R}p_{i}^{2}

The measure equals the probability that two entities taken at random from the dataset (with replacement) represent the same type, where p_{i} is the proportion of entities belonging to type i and R is the total number of types in the dataset.
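As a small worked example (the counts are hypothetical, chosen only for illustration), take a dataset of 10 entities split across R = 3 types with counts 5, 3 and 2. The proportions and the resulting Simpson index are:

```latex
p_{1}=0.5,\quad p_{2}=0.3,\quad p_{3}=0.2

\lambda =0.5^{2}+0.3^{2}+0.2^{2}=0.25+0.09+0.04=0.38
```

In this example, two entities drawn at random with replacement have a 38% chance of belonging to the same type.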

 

Gini–Simpson index

The transformation 1-\lambda equals the probability that two entities taken at random from the dataset (with replacement) represent different types.

The more even the distribution, the higher the index; the more concentrated the distribution, the lower the index.
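The effect of evenness can be checked with a minimal sketch in plain Python. The function name `gini_simpson` and the sample counts are assumptions made for illustration and are not part of the original script. For R types the index ranges from 0 (every entity has the same type) up to 1 - 1/R (all types equally common).

```python
def gini_simpson(counts):
    """Return 1 - sum(p_i^2) for a list of non-negative per-type counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one entity")
    # p_i = c / total; the Simpson index is the sum of the squared proportions
    return 1 - sum((c / total) ** 2 for c in counts)


# Hypothetical counts for four types, 40 entities in each dataset
even = gini_simpson([10, 10, 10, 10])    # perfectly even: 1 - 1/4
concentrated = gini_simpson([37, 1, 1, 1])  # dominated by one type
single = gini_simpson([40])              # only one type present

print(even, concentrated, single)
```

With these inputs the even dataset scores 0.75, the concentrated one about 0.1425, and the single-type one 0.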

 

Code

import pandas as pd


def gini_calc(group):
    """Return the Gini-Simpson index (1 - sum of p_i^2) for one group."""
    # p_i: share of each type in the group's total count
    proportions = group['cnt'] / group['cnt'].sum()
    return 1 - (proportions ** 2).sum()


# Expected columns: population, subpopulation, type, cnt
df = pd.read_excel('gini.xlsx')

# Merge duplicate rows so each (population, subpopulation, type) appears once
df = df.groupby(['population', 'subpopulation', 'type'], as_index=False)['cnt'].sum()

# Compute one index per (population, subpopulation) pair
rows = []
for (population, subpopulation), group in df.groupby(['population', 'subpopulation']):
    rows.append({
        'population': population,
        'subpopulation': subpopulation,
        'gini_simpson_index': gini_calc(group),
    })

data = pd.DataFrame(rows)
# to_csv returns None when given a path, so the result is not assigned
data.to_csv('gini_result.csv')
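The script above depends on a `gini.xlsx` file that is not shown. To try the same calculation without that file, the following is a minimal sketch using a small in-memory DataFrame and a vectorized `groupby(...).transform` in place of the explicit loop. The sample rows and the helper names `keys` and `sq_prop` are hypothetical; only the column names `population`, `subpopulation`, `type` and `cnt` come from the original code.

```python
import pandas as pd

# Hypothetical sample data with the same columns the script expects
df = pd.DataFrame({
    "population":    ["A", "A", "A", "A", "B", "B", "B"],
    "subpopulation": ["a1", "a1", "a2", "a2", "b1", "b1", "b1"],
    "type":          ["x", "y", "x", "y", "x", "y", "z"],
    "cnt":           [5, 5, 9, 1, 2, 2, 4],
})

keys = ["population", "subpopulation"]

# Make sure each type appears once per group before computing proportions
df = df.groupby(keys + ["type"], as_index=False)["cnt"].sum()

# Squared proportion of each type within its (population, subpopulation) group
df["sq_prop"] = (df["cnt"] / df.groupby(keys)["cnt"].transform("sum")) ** 2

# Sum the squared proportions per group (Simpson index), then take 1 - lambda
result = df.groupby(keys, as_index=False)["sq_prop"].sum()
result["gini_simpson_index"] = 1 - result["sq_prop"]
result = result.drop(columns="sq_prop")

print(result)
```

For this sample the groups A/a1, A/a2 and B/b1 come out at roughly 0.5, 0.18 and 0.625: the evenly split group scores highest among the two-type groups, and the 9-to-1 group scores lowest.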

 


