Remove DataFrame rows with missing values in Python
The Problem
In pandas, how do I remove DataFrame rows in which every column is None or NaN? And how do I remove rows in which only some of the columns hold these values?
The Solution
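We can achieve both of these results using the DataFrame.dropna method, which treats both None and NaN as missing. Its how parameter decides when a row is dropped: how="all" drops a row only if every value in it is missing, while how="any" (the default) drops a row if at least one value is missing. For example: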
import pandas
from numpy import nan

df = pandas.DataFrame(
    {
        "Test 1": [90, 10, nan, nan],
        "Test 2": [41, nan, 32, nan],
        "Test 3": [89, 35, 72, nan],
        "Test 4": [52, nan, nan, nan],
    }
)
print(df)
# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
# 1    10.0     NaN    35.0     NaN
# 2     NaN    32.0    72.0     NaN
# 3     NaN     NaN     NaN     NaN

df_no_empty_rows = df.dropna(how="all")  # drop rows containing all NaNs
print(df_no_empty_rows)
# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
# 1    10.0     NaN    35.0     NaN
# 2     NaN    32.0    72.0     NaN

df_no_empty_values = df.dropna(how="any")  # drop rows containing any NaNs
print(df_no_empty_values)
# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
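If "only some columns" means that only particular columns should be checked, dropna also accepts a subset parameter that restricts the check to the listed columns, and a thresh parameter that keeps rows with at least a given number of non-missing values. The following is a minimal sketch using the same data as above; the variable names df_tests_1_and_3 and df_at_least_two are illustrative only.

```python
import pandas
from numpy import nan

df = pandas.DataFrame(
    {
        "Test 1": [90, 10, nan, nan],
        "Test 2": [41, nan, 32, nan],
        "Test 3": [89, 35, 72, nan],
        "Test 4": [52, nan, nan, nan],
    }
)

# drop rows with a NaN in "Test 1" or "Test 3"; other columns are ignored
df_tests_1_and_3 = df.dropna(subset=["Test 1", "Test 3"])
print(df_tests_1_and_3)
# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
# 1    10.0     NaN    35.0     NaN

# keep only rows that have at least two non-NaN values
df_at_least_two = df.dropna(thresh=2)
print(df_at_least_two)
# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
# 1    10.0     NaN    35.0     NaN
# 2     NaN    32.0    72.0     NaN
```

In each case dropna returns a new DataFrame and leaves df unchanged, so the result needs to be assigned to a variable as shown.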