In this article, we will discuss how to check for and handle NaN values in pandas DataFrames.

In pandas, it is essential to identify and handle NaN values in data frames to ensure accurate and reliable analysis.

Checking for NaN values

We can use the isna() method to check for NaN values in a pandas DataFrame. This method returns a Boolean mask with the same shape as the DataFrame: True where a value is missing (NaN or None) and False everywhere else.

See the example below:

import pandas as pd

# create a pandas dataframe
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [6, 7, None, 9, 10]})

# check for NaN values
print(df.isna())

In the above example, we created a pandas DataFrame with two columns. Because col2 is numeric, the None value is stored as NaN. We then used the isna() method to check for NaN values in the DataFrame.

Output:

    col1   col2
0  False  False
1  False  False
2  False   True
3  False  False
4  False  False

The above output shows a Boolean mask in which the only True value is at index 2 of col2, meaning the third row of that column contains a NaN value.
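
On a larger DataFrame, reading the full mask is impractical, so it usually helps to summarize it. The following is a minimal sketch using the same example data; the variable names (mask, nan_per_column, has_nan, rows_with_nan) are chosen here for illustration only.

```python
import pandas as pd

# Same example data as above; None in a numeric column is stored as NaN
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [6, 7, None, 9, 10]})

mask = df.isna()

# Number of NaN values in each column (True counts as 1 when summed)
nan_per_column = mask.sum()

# Whether the DataFrame contains any NaN value at all
has_nan = bool(mask.any().any())

# Rows that contain at least one NaN value
rows_with_nan = df[mask.any(axis=1)]

print(nan_per_column)
print(has_nan)
print(rows_with_nan)
```

Here nan_per_column reports 0 for col1 and 1 for col2, and rows_with_nan contains only the row at index 2.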

Alternatively, we can use the isnull() method, which is an alias for isna() and returns the same result.

import pandas as pd

# create a pandas dataframe
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [6, 7, None, 9, 10]})

# check for NaN values
print(df.isnull())

Output:

    col1   col2
0  False  False
1  False  False
2  False   True
3  False  False
4  False  False
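
To confirm that the two methods agree, the masks can be compared directly. This is a small sketch on the same example data; same_mask is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [6, 7, None, 9, 10]})

# equals() checks that both masks have the same shape and the same values
same_mask = bool(df.isna().equals(df.isnull()))
print(same_mask)
```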

Handling NaN values

Once we have identified the NaN values in a pandas DataFrame, we can handle them in different ways. One common approach is to fill them with a specific value, such as the mean, median, or mode of the column.

For example, we can use the fillna() method to fill the NaN values in a column with the mean value of that column.

import pandas as pd

# create a pandas dataframe
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [6, 7, None, 9, 10]})

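# fill NaN values in col2 with the mean of that column
# (a column-to-value dict avoids calling fillna(inplace=True) on a selected
# column, which recent pandas versions warn about as chained assignment)
df.fillna({'col2': df['col2'].mean()}, inplace=True)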

# print the dataframe
print(df)

Output:

   col1  col2
0     1   6.0
1     2   7.0
2     3   8.0
3     4   9.0
4     5  10.0

In the above example, we used the fillna() method to fill the NaN values in the col2 column with the mean value of the column.

Passing inplace=True modifies the original DataFrame instead of returning a new one.
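
The median and mode mentioned earlier can be used in the same way. The sketch below is one possible approach rather than the only one: it leaves out inplace=True so that fillna() returns new DataFrames, and the names filled_median and filled_mode are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [6, 7, None, 9, 10]})

# Fill with the median of the non-NaN values in col2
filled_median = df.fillna({'col2': df['col2'].median()})

# mode() returns a Series because several values can tie for most frequent;
# taking the first entry is an assumption about how to break ties
filled_mode = df.fillna({'col2': df['col2'].mode().iloc[0]})

print(filled_median)
print(filled_mode)
```

Since fillna() is called without inplace=True here, df itself still contains the NaN value, which is convenient when comparing several fill strategies on the same data.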
