pandas学习之一：excel转字典

2024-07-12 12:46| 来源: 网络整理| 查看: 265

1.to_dict() 函数基本语法 DataFrame.to_dict (self, orient='dict' , into= ) --- 官方文档

函数种只需要填写一个参数：orient 即可，但对于写入orient的不同，字典的构造方式也不同，官网一共给出了6种，并且其中一种是列表类型：

orient ='dict'，是函数默认的，转化后的字典形式：{column(列名) : {index(行名) : value(值) )}}；orient ='list' ，转化后的字典形式：{column(列名) :{[ values ](值)}};orient ='series' ，转化后的字典形式：{column(列名) : Series (values) (值)};orient ='split' ，转化后的字典形式：{'index' : [index]，‘columns' :[columns]，’data‘ : [values]};orient ='records' ，转化后是 list形式：[{column(列名) : value(值)}......{column:value}];orient ='index' ，转化后的字典形式：{index(值) : {column(列名) : value(值)}};

备注：

1，上面中 value 代表数据表中的值，column表示列名，index 表示行名，如下图所示：

2，{ }表示字典数据类型，字典中的数据是以 {key : value} 的形式显示，是键名和键值一一对应形成的。

2，关于6种构造方式进行代码实例

六种构造方式所处理 DataFrame 数据是统一的，如下：

>>> import pandas as pd >>> df =pd.DataFrame({'col_1':[1,2],'col_2':[0.5,0.75]},index =['row1','row2']) >>> df col_1 col_2 row1 1 0.50 row2 2 0.75

2.1，orient ='dict' — {column(列名) : {index(行名) : value(值) )}}

to_dict('list') 时，构造好的字典形式：{第一列的列名:{第一行的行名：value值，第二行行名，value值}，....}；

>>> df col_1 col_2 row1 1 0.50 row2 2 0.75 >>> df.to_dict('dict') {'col_1': {'row1': 1, 'row2': 2}, 'col_2': {'row1': 0.5, 'row2': 0.75}}

orient = 'dict 可以很方面得到在某一列对应的行名与各值之间的字典数据类型，例如在源数据上面我想得到在col_1这一列行名与各值之间的字典，直接在生成字典查询列名为col_1：

>>> df col_1 col_2 row1 1 0.50 row2 2 0.75 >>> df.to_dict('dict')['col_1'] {'row1': 1, 'row2': 2} 2.2，orient ='list' — {column(列名) :{[ values ](值)}};

生成字典中 key为各列名，value为各列对应值的列表

>>> df col_1 col_2 row1 1 0.50 row2 2 0.75 >>> df.to_dict('list') {'col_1': [1, 2], 'col_2': [0.5, 0.75]}

orient = 'list' 时，可以很方面得到在某一列各值所生成的列表集合，例如我想得到col_2 对应值得列表：

>>> df col_1 col_2 row1 1 0.50 row2 2 0.75 >>> df.to_dict('list')['col_2'] [0.5, 0.75] 2.3，orient ='series' — {column(列名) : Series (values) (值)};

orient ='series' 与 orient = 'list' 唯一区别就是，这里的 value 是 Series数据类型，而前者为列表类型

>>> df col_1 col_2 row1 1 0.50 row2 2 0.75 >>> df.to_dict('series') {'col_1': row1 1 row2 2 Name: col_1, dtype: int64, 'col_2': row1 0.50 row2 0.75 Name: col_2, dtype: float64} 2.4，orient ='split' — {'index' : [index]，‘columns' :[columns]，’data‘ : [values]};

orient ='split' 得到三个键值对，列名、行名、值各一个，value统一都是列表形式；

>>> df col_1 col_2 row1 1 0.50 row2 2 0.75 >>> df.to_dict('split') {'index': ['row1', 'row2'], 'columns': ['col_1', 'col_2'], 'data': [[1, 0.5], [2, 0.75]]}

orient = 'split' 可以很方面得到 DataFrame数据表中全部列名或者行名的列表形式，例如我想得到全部列名：

>>> df col_1 col_2 row1 1 0.50 row2 2 0.75 >>> df.to_dict('split')['columns'] ['col_1', 'col_2'] 2.5，orient ='records' — [{column:value(值)},{column:value}....{column:value}];

注意的是，orient ='records' 返回的数据类型不是 dict ; 而是list 列表形式，由全部列名与每一行的值形成一一对应的映射关系:

>>> df col_1 col_2 row1 1 0.50 row2 2 0.75 >>> df.to_dict('records') [{'col_1': 1, 'col_2': 0.5}, {'col_1': 2, 'col_2': 0.75}]

这个构造方式的好处就是，很容易得到列名与某一行值形成得字典数据；例如我想要第2行{column:value}得数据：

>>> df col_1 col_2 row1 1 0.50 row2 2 0.75 >>> df.to_dict('records')[1] {'col_1': 2, 'col_2': 0.75} 2.6，orient ='index' — {index:{culumn:value}};

orient ='index'与2.1用法刚好相反，求某一行中列名与值之间一一对应关系(查询效果与2.5相似)：

>>> df col_1 col_2 row1 1 0.50 row2 2 0.75 >>> df.to_dict('index') {'row1': {'col_1': 1, 'col_2': 0.5}, 'row2': {'col_1': 2, 'col_2': 0.75}} #查询行名为 row2 列名与值一一对应字典数据类型 >>> df.to_dict('index')['row2'] {'col_1': 2, 'col_2': 0.75}

【本文地址】

公司简介

联系我们