Pandas Drop Rows With NaN

Created: January-16, 2021 | Updated: November-26, 2021

  1. Pandas Drop Rows With NaN Using the DataFrame.notna() Method
  2. Pandas Drop Rows Only With NaN Values for All Columns Using DataFrame.dropna() Method
  3. Pandas Drop Rows Only With NaN Values for a Particular Column Using DataFrame.dropna() Method
  4. Pandas Drop Rows With NaN Values for Any Column Using DataFrame.dropna() Method

This tutorial explains how to drop rows that contain NaN values from a Pandas DataFrame using the DataFrame.notna() and DataFrame.dropna() methods.

We will use the following DataFrame in the first example.

    import pandas as pd

    # None in a numeric column is stored as NaN
    data = pd.DataFrame({
        'Name': ['Alice', 'Steven', 'Neesham', 'Chris', 'Alice'],
        'Age':  [19, None, 18, 21, None],
        'Income($)': [4000, 5000, None, 3500, None],
        'Expense($)': [3000, 2000, 2500, 25000, None]
    })

    print(data)

Output:

          Name   Age  Income($)  Expense($)
    0    Alice  19.0     4000.0      3000.0
    1   Steven   NaN     5000.0      2000.0
    2  Neesham  18.0        NaN      2500.0
    3    Chris  21.0     3500.0     25000.0
    4    Alice   NaN        NaN         NaN

Pandas Drop Rows With NaN Using the DataFrame.notna() Method

The DataFrame.notna() method returns a boolean DataFrame with the same shape as the caller. Each element that is not NaN maps to True, and each NaN element maps to False.
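To see the mask itself, here is a minimal sketch on a small made-up frame (the name `df` and its two columns are assumptions for illustration only, not part of the tutorial's data).

```python
import pandas as pd

# Hypothetical two-row frame; None becomes a missing value
df = pd.DataFrame({"a": [1.0, None], "b": ["x", None]})

# Boolean DataFrame with the same shape as df:
# True where a value is present, False where it is missing
mask = df.notna()
print(mask)
```

The mask has one True/False entry per cell, which is what makes it usable as a row filter.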

    import pandas as pd

    data = pd.DataFrame({
        'Name': ['Alice', 'Steven', 'Neesham', 'Chris', 'Alice'],
        'Age':  [19, None, 18, 21, None],
        'Income($)': [4000, 5000, None, 3500, None],
        'Expense($)': [3000, 2000, 2500, 25000, None]
    })

    print("Initial DataFrame:")
    print(data)
    print("")

    # Keep only the rows where Income($) is not NaN
    data = data[data['Income($)'].notna()]
    print("DataFrame after removing rows with NaN value in Income Field:")
    print(data)

Output:

    Initial DataFrame:
          Name   Age  Income($)  Expense($)
    0    Alice  19.0     4000.0      3000.0
    1   Steven   NaN     5000.0      2000.0
    2  Neesham  18.0        NaN      2500.0
    3    Chris  21.0     3500.0     25000.0
    4    Alice   NaN        NaN         NaN

    DataFrame after removing rows with NaN value in Income Field:
         Name   Age  Income($)  Expense($)
    0   Alice  19.0     4000.0      3000.0
    1  Steven   NaN     5000.0      2000.0
    3   Chris  21.0     3500.0     25000.0

Here, we apply the notna() method to the Income($) column, which returns a boolean Series that is True where a value is present and False where it is NaN. When we pass this Series as an index to the original DataFrame, we keep only the rows whose Income($) value is not NaN. Rows with NaN in other columns, such as row 1, are left in place.
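The same idea extends to more than one column. The sketch below is one possible way to do it, not the only one: it selects two columns, calls notna() on them, and combines the result row-wise with all(axis=1). The variable names `keep` and `filtered` are chosen here for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
    "Age": [19, None, 18, 21, None],
    "Income($)": [4000, 5000, None, 3500, None],
})

# True only for rows where both Age and Income($) are present
keep = data[["Age", "Income($)"]].notna().all(axis=1)

# Boolean indexing keeps the rows marked True
filtered = data[keep]
print(filtered)
```

With this data, only the rows for the first Alice and for Chris have both values, so those two rows remain.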

Pandas Drop Rows Only With NaN Values for All Columns Using DataFrame.dropna() Method

    import pandas as pd

    data = pd.DataFrame({
        'Id': [621, 645, 210, 345, None],
        'Age':  [19, None, 18, 21, None],
        'Income($)': [4000, 5000, None, 3500, None],
        'Expense($)': [3000, 2000, 2500, 25000, None]
    })

    print("Initial DataFrame:")
    print(data)
    print("")

    # Drop a row only when every column in it is NaN
    data = data.dropna(how='all')
    print("DataFrame after removing rows with NaN value in All Columns:")
    print(data)

Output:

    Initial DataFrame:
          Id   Age  Income($)  Expense($)
    0  621.0  19.0     4000.0      3000.0
    1  645.0   NaN     5000.0      2000.0
    2  210.0  18.0        NaN      2500.0
    3  345.0  21.0     3500.0     25000.0
    4    NaN   NaN        NaN         NaN

    DataFrame after removing rows with NaN value in All Columns:
          Id   Age  Income($)  Expense($)
    0  621.0  19.0     4000.0      3000.0
    1  645.0   NaN     5000.0      2000.0
    2  210.0  18.0        NaN      2500.0
    3  345.0  21.0     3500.0     25000.0

Setting how='all' in the dropna() method drops a row only if every column value in that row is NaN. Here only row 4 is removed; rows 1 and 2 are kept because they still contain non-NaN values.
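When how='all' is too lenient, the thresh parameter of dropna() offers a middle ground: it keeps a row only if it has at least that many non-NaN values. The following is a small sketch on a made-up frame (`df` and its values are assumptions for illustration) comparing the two settings.

```python
import pandas as pd

# Hypothetical frame: row 2 has one value, row 3 has none
df = pd.DataFrame({
    "Id": [1, 2, None, None],
    "Age": [19, None, None, None],
    "Income($)": [4000, 5000, 3000, None],
})

# Drops only the row in which every value is NaN
all_dropped = df.dropna(how="all")

# Keeps rows that have at least 2 non-NaN values
thresh_dropped = df.dropna(thresh=2)

print(all_dropped)
print(thresh_dropped)
```

Note that how and thresh are used in separate calls here; recent pandas versions do not accept both in the same call.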

Pandas Drop Rows Only With NaN Values for a Particular Column Using DataFrame.dropna() Method

    import pandas as pd

    data = pd.DataFrame({
        'Id': [621, 645, 210, 345, None],
        'Age':  [19, None, 18, 21, None],
        'Income($)': [4000, 5000, None, 3500, None],
        'Expense($)': [3000, 2000, 2500, 25000, None]
    })

    print("Initial DataFrame:")
    print(data)
    print("")

    # Only the Id column is checked for NaN
    data = data.dropna(subset=["Id"])
    print("DataFrame after removing rows with NaN value in Id Column:")
    print(data)

Output:

    Initial DataFrame:
          Id   Age  Income($)  Expense($)
    0  621.0  19.0     4000.0      3000.0
    1  645.0   NaN     5000.0      2000.0
    2  210.0  18.0        NaN      2500.0
    3  345.0  21.0     3500.0     25000.0
    4    NaN   NaN        NaN         NaN

    DataFrame after removing rows with NaN value in Id Column:
          Id   Age  Income($)  Expense($)
    0  621.0  19.0     4000.0      3000.0
    1  645.0   NaN     5000.0      2000.0
    2  210.0  18.0        NaN      2500.0
    3  345.0  21.0     3500.0     25000.0

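Passing subset=["Id"] drops the rows that have a NaN value in the Id column and ignores NaN values in every other column. Here only row 4 is removed; rows 1 and 2 stay because their Id values are present.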
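The subset parameter also accepts several column names, and it can be combined with how. As a hedged sketch using the same data (the result names `any_missing` and `both_missing` are chosen here for illustration):

```python
import pandas as pd

data = pd.DataFrame({
    "Id": [621, 645, 210, 345, None],
    "Age": [19, None, 18, 21, None],
    "Income($)": [4000, 5000, None, 3500, None],
    "Expense($)": [3000, 2000, 2500, 25000, None],
})

cols = ["Age", "Income($)"]

# Default how='any': drop rows missing Age or Income($)
any_missing = data.dropna(subset=cols)

# how='all': drop rows only when both Age and Income($) are missing
both_missing = data.dropna(subset=cols, how="all")

print(any_missing)
print(both_missing)
```

Columns outside the subset, such as Id and Expense($), play no part in deciding which rows are dropped.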

Pandas Drop Rows With NaN Values for Any Column Using DataFrame.dropna() Method

    import pandas as pd

    data = pd.DataFrame({
        'Id': [621, 645, 210, 345, None],
        'Age':  [19, None, 18, 21, None],
        'Income($)': [4000, 5000, None, 3500, None],
        'Expense($)': [3000, 2000, 2500, 25000, None]
    })

    print("Initial DataFrame:")
    print(data)
    print("")

    # With no arguments, any row containing a NaN is dropped
    data = data.dropna()
    print("DataFrame after removing rows with NaN value in any column:")
    print(data)

Output:

    Initial DataFrame:
          Id   Age  Income($)  Expense($)
    0  621.0  19.0     4000.0      3000.0
    1  645.0   NaN     5000.0      2000.0
    2  210.0  18.0        NaN      2500.0
    3  345.0  21.0     3500.0     25000.0
    4    NaN   NaN        NaN         NaN

    DataFrame after removing rows with NaN value in any column:
          Id   Age  Income($)  Expense($)
    0  621.0  19.0     4000.0      3000.0
    3  345.0  21.0     3500.0     25000.0

By default (how='any'), the dropna() method removes every row that has at least one NaN value.
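As the output above shows, the surviving rows keep their original index labels (0 and 3). If a fresh 0-based index is wanted, one option is to chain reset_index(drop=True) after dropna(). This is a sketch of that follow-up step, with `cleaned` as an illustrative name; it also shows that dropna() returns a new DataFrame and leaves the original untouched.

```python
import pandas as pd

data = pd.DataFrame({
    "Id": [621, 645, 210, 345, None],
    "Age": [19, None, 18, 21, None],
    "Income($)": [4000, 5000, None, 3500, None],
    "Expense($)": [3000, 2000, 2500, 25000, None],
})

# dropna() returns a new DataFrame; reset_index(drop=True)
# renumbers the remaining rows and discards the old labels
cleaned = data.dropna().reset_index(drop=True)

print(cleaned)
print(len(data))  # the original frame still has all of its rows
```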
