Pandas: How to Use dropna() with Specific Columns

2023-04-01 01:07 | Source: compiled from the web

You can use the dropna() function with the subset argument to drop rows from a pandas DataFrame that contain missing values in specific columns.

Here are the most common ways to use this function in practice:

Method 1: Drop Rows with Missing Values in One Specific Column

df.dropna(subset = ['column1'], inplace=True)

Method 2: Drop Rows with Missing Values in One of Several Specific Columns

df.dropna(subset = ['column1', 'column2', 'column3'], inplace=True)

The following examples show how to use each method in practice with the following pandas DataFrame:

import pandas as pd
import numpy as np

#create DataFrame
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, np.nan, 19, 14, 14, 11, 20, 28],
                   'assists': [5, np.nan, np.nan, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, np.nan]})

#view DataFrame
print(df)

  team  points  assists  rebounds
0    A    18.0      5.0      11.0
1    B     NaN      NaN       8.0
2    C    19.0      NaN      10.0
3    D    14.0      9.0       6.0
4    E    14.0     12.0       6.0
5    F    11.0      9.0       5.0
6    G    20.0      9.0       9.0
7    H    28.0      4.0       NaN

Example 1: Drop Rows with Missing Values in One Specific Column

We can use the following syntax to drop rows with missing values in the ‘assists’ column:

#drop rows with missing values in 'assists' column
df.dropna(subset = ['assists'], inplace=True)

#view updated DataFrame
print(df)

  team  points  assists  rebounds
0    A    18.0      5.0      11.0
3    D    14.0      9.0       6.0
4    E    14.0     12.0       6.0
5    F    11.0      9.0       5.0
6    G    20.0      9.0       9.0
7    H    28.0      4.0       NaN

Notice that the two rows with missing values in the ‘assists’ column have both been removed from the DataFrame.

Also note that the last row in the DataFrame is kept even though it has a missing value because the missing value is not located in the ‘assists’ column.
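Because inplace=True modifies df itself, the dropped rows are no longer available to any later step. As a minimal sketch of the non-mutating alternative, assuming the same example DataFrame as above (the name df_clean is hypothetical and used only for illustration), dropna() can be called without inplace so that it returns a new DataFrame:

```python
import numpy as np
import pandas as pd

# same example data as in the original DataFrame above
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, np.nan, 19, 14, 14, 11, 20, 28],
                   'assists': [5, np.nan, np.nan, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, np.nan]})

# without inplace=True, dropna() returns a new DataFrame and leaves df untouched
df_clean = df.dropna(subset=['assists'])

print(len(df))        # 8 rows: the original is unchanged
print(len(df_clean))  # 6 rows: the two rows with missing 'assists' are gone
```

Keeping the original DataFrame intact this way makes it easier to try several subset choices on the same data.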

Example 2: Drop Rows with Missing Values in One of Several Specific Columns

Starting again from the original DataFrame (Example 1 modified df in place, so it must be re-created first), we can use the following syntax to drop rows with missing values in the ‘points’ or ‘rebounds’ columns:

#drop rows with missing values in 'points' or 'rebounds' column
#(df here is the original eight-row DataFrame, re-created as shown earlier)
df.dropna(subset = ['points', 'rebounds'], inplace=True)

#view updated DataFrame
print(df)

  team  points  assists  rebounds
0    A    18.0      5.0      11.0
2    C    19.0      NaN      10.0
3    D    14.0      9.0       6.0
4    E    14.0     12.0       6.0
5    F    11.0      9.0       5.0
6    G    20.0      9.0       9.0

Notice that the two rows with missing values in the ‘points’ or ‘rebounds’ columns have been removed from the DataFrame.
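The "one of several" behavior comes from the default how='any' setting of dropna(). As a sketch, assuming the same example DataFrame as above, the following contrasts it with how='all', which drops a row only when every column listed in subset is missing. The variable names any_missing and all_missing are illustrative only:

```python
import numpy as np
import pandas as pd

# same example data as in the original DataFrame above
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, np.nan, 19, 14, 14, 11, 20, 28],
                   'assists': [5, np.nan, np.nan, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, np.nan]})

# default how='any': drop a row if 'points' OR 'assists' is missing
any_missing = df.dropna(subset=['points', 'assists'])

# how='all': drop a row only if 'points' AND 'assists' are both missing
all_missing = df.dropna(subset=['points', 'assists'], how='all')

print(any_missing['team'].tolist())  # ['A', 'D', 'E', 'F', 'G', 'H']
print(all_missing['team'].tolist())  # ['A', 'C', 'D', 'E', 'F', 'G', 'H']
```

Team C survives under how='all' because only its 'assists' value is missing, while team B is dropped in both cases because both columns are missing.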

Note: You can find the complete documentation for the dropna() function in the official pandas documentation.

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

Pandas: How to Reset Index After Using dropna()
Pandas: How to Drop Columns with NaN Values
Pandas: How to Drop Rows Based on Multiple Conditions


