Remove DataFrame rows with missing values in Python

David Y.

The Problem

In Pandas, how do I remove DataFrame rows in which every column contains None or NaN? How do I remove rows in which only some of the columns contain these values?

The Solution

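We can achieve both of these results using the DataFrame.dropna method, which returns a new DataFrame and leaves the original unchanged. Its how parameter controls whether a row is dropped when all of its values are missing or when any of them are. For example: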

import pandas
from numpy import nan

df = pandas.DataFrame(
    {
        "Test 1": [90, 10, nan, nan],
        "Test 2": [41, nan, 32, nan],
        "Test 3": [89, 35, 72, nan],
        "Test 4": [52, nan, nan, nan],
    }
)
print(df)

# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
# 1    10.0     NaN    35.0     NaN
# 2     NaN    32.0    72.0     NaN
# 3     NaN     NaN     NaN     NaN

df_no_empty_rows = df.dropna(how="all")  # drop rows in which every value is NaN
print(df_no_empty_rows)

# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
# 1    10.0     NaN    35.0     NaN
# 2     NaN    32.0    72.0     NaN

df_no_empty_values = df.dropna(how="any")  # drop rows in which at least one value is NaN
print(df_no_empty_values)

# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
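
If only certain columns matter, or if a row should survive as long as it has enough data, dropna also accepts the subset and thresh parameters. The following is a minimal sketch that reuses the DataFrame from above; the names df_subset and df_thresh are chosen here for illustration and do not come from the original answer.

```python
import pandas
from numpy import nan

# Same data as in the example above.
df = pandas.DataFrame(
    {
        "Test 1": [90, 10, nan, nan],
        "Test 2": [41, nan, 32, nan],
        "Test 3": [89, 35, 72, nan],
        "Test 4": [52, nan, nan, nan],
    }
)

# Only look at "Test 1" and "Test 3" when deciding which rows to drop.
# Rows 0 and 1 have values in both columns, so they are kept.
df_subset = df.dropna(subset=["Test 1", "Test 3"])
print(df_subset)

# Keep rows that have at least two non-missing values.
# Rows 0, 1 and 2 qualify; row 3 has none and is dropped.
df_thresh = df.dropna(thresh=2)
print(df_thresh)
```

Here subset restricts the check to the listed columns, while thresh sets the minimum number of non-missing values a row needs in order to be kept.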
