Pandas handles missing data in much the same way NumPy does. It uses two null values that are available in Python, NaN and None, to represent missing data in the data sets it processes, and each choice has its own benefits and limitations. In this article, we will learn how to replace NaN values with zeros in a Pandas DataFrame. Let's get started!
What is NaN in a Pandas DataFrame?
NaN stands for "not a number" and can be interpreted to mean "missing". Unlike None, which is a Python object, NaN is a special floating-point value defined by the IEEE 754 standard, so it can be stored in a numeric (float) column like any other number. Keep in mind that any arithmetic operation involving NaN (addition, subtraction, multiplication, division, and so on) also returns NaN.
For example
Input:
import pandas as pd
import numpy as np
p = pd.Series([5, 6, 7, np.nan, 9, None])
print(p)
Output:
0 5.0
1 6.0
2 7.0
3 NaN
4 9.0
5 NaN
dtype: float64
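To make the point about arithmetic with NaN concrete, here is a minimal sketch that reuses the Series from the example above. The variable names (shifted, missing_mask, and so on) are only illustrative and are not part of the original example.

```python
import numpy as np
import pandas as pd

p = pd.Series([5, 6, 7, np.nan, 9, None])

# Any arithmetic with NaN yields NaN again
total_plain = np.nan + 1
shifted = p + 1  # the two missing positions stay NaN

# NaN never compares equal to itself, so use isna() to detect it
nan_equals_itself = (np.nan == np.nan)
missing_mask = p.isna()

print(shifted)
print("NaN == NaN:", nan_equals_itself)
print("Missing values:", int(missing_mask.sum()))
```

Because an equality check cannot find NaN, isna() is the usual way to locate missing values before replacing them.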
How to Replace NaN Values with Zeros in Pandas DataFrame?
To replace NaN values with zeros in a Pandas DataFrame, we will use two main tools: DataFrame.fillna() and DataFrame.replace().
Method 1: DataFrame.fillna()
DataFrame.fillna() is used to replace NaN/None with a value of your choice.
Syntax:
DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None)
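As a rough illustration of the value argument in the syntax above, the sketch below passes first a scalar and then a dict keyed by column name. The column names 'A' and 'B' are made up for this example, and only the value argument is used, since the other parameters may behave differently across pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame, used only for illustration
df = pd.DataFrame({'A': [1, np.nan, 3], 'B': [np.nan, 5, np.nan]})

# Scalar value: every NaN in the frame becomes 0
filled_all = df.fillna(0)

# Dict value: only the listed column is filled, 'B' keeps its NaN
filled_a = df.fillna({'A': 0})

print(filled_all)
print(filled_a)
```

Note that fillna() returns a new DataFrame by default, so df itself is unchanged unless you assign the result back.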
Method 2: DataFrame.replace()
DataFrame.replace() is a general find-and-replace method. Here it looks for NaN values and substitutes the value you specify.
Syntax:
DataFrame.replace(to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')
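Since replace() is a general find-and-replace, it can also target values other than NaN in the same call. The sketch below assumes a data set where -999 was used as a placeholder for missing prices; that sentinel is hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# -999 is an assumed placeholder for "missing", not something from the article
df = pd.DataFrame({'Price': [60, np.nan, -999, 30]})

# A list for to_replace sends every listed value to the same replacement
cleaned = df.replace([np.nan, -999], 0)

print(cleaned)
```

This is the main practical difference from fillna(), which only acts on NaN/None.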
To make both methods easier to follow, the sections below give examples for some specific cases.
Replace NaN values with zeros for a column using fillna()
Here is a DataFrame whose column contains a NaN value:
Input:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Price': [60, np.nan, 40, 30]})
print (df)
Output:
Price
0 60.0
1 NaN
2 40.0
3 30.0
In this case, to replace NaN with 0 we can use the following syntax:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Price': [60, np.nan, 40, 30]})
# Assign back to 'Price' so the column itself is updated
df['Price'] = df['Price'].fillna(0)
print(df)
Output:
Price
0 60.0
1 0.0
2 40.0
3 30.0
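The output above still shows 60.0, 0.0 and so on, because a column that once held NaN keeps its float dtype after filling. If whole numbers are wanted, one possible follow-up (an addition to the article's example, not part of it) is to cast the column once the NaN values are gone:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Price': [60, np.nan, 40, 30]})

# Fill first, then cast: casting a column that still contains NaN to int raises an error
df['Price'] = df['Price'].fillna(0).astype('int64')

print(df)
print(df['Price'].dtype)
```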
Replace NaN values with zeros for a column using replace()
Similarly, you can use replace(), as follows:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Price': [60, np.nan, 40, 30]})
# Assign back to 'Price' so the column itself is updated
df['Price'] = df['Price'].replace(np.nan, 0)
print(df)
Output:
Price
0 60.0
1 0.0
2 40.0
3 30.0
The NaN value has now been replaced with 0.
Replace NaN values with zeros for an entire DataFrame using fillna()
Sometimes several columns in your data contain NaN, like this:
Input:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Price_1': [60, np.nan, 40, 30],
'Price_2': [np.nan, 90, np.nan, np.nan]
})
print (df)
Output:
Price_1 Price_2
0 60.0 NaN
1 NaN 90.0
2 40.0 NaN
3 30.0 NaN
In this case, the syntax you use would be:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Price_1': [60, np.nan, 40, 30],
'Price_2': [np.nan, 90, np.nan, np.nan]
})
df = df.fillna(0)
print (df)
Output:
Price_1 Price_2
0 60.0 0.0
1 0.0 90.0
2 40.0 0.0
3 30.0 0.0
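On a larger DataFrame it is hard to confirm by eye that no NaN is left. A simple check, sketched here with the same data, is to count missing values per column before and after filling. The names missing_before and missing_after are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Price_1': [60, np.nan, 40, 30],
                   'Price_2': [np.nan, 90, np.nan, np.nan]})

# Number of NaN values in each column before filling
missing_before = df.isna().sum()

filled = df.fillna(0)

# The same count after filling
missing_after = filled.isna().sum()

print(missing_before)
print(missing_after)
```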
Replace NaN values with zeros for an entire DataFrame using replace()
Likewise, you can use the replace() method in this situation:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Price_1': [60, np.nan, 40, 30],
'Price_2': [np.nan, 90, np.nan, np.nan]
})
df = df.replace(np.nan, 0)
print(df)
Output:
Price_1 Price_2
0 60.0 0.0
1 0.0 90.0
2 40.0 0.0
3 30.0 0.0
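For this data, both approaches give the same result. The short check below is a sketch added for illustration: it compares the two outputs with DataFrame.equals(), which tests that shape, dtypes and values all match.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Price_1': [60, np.nan, 40, 30],
                   'Price_2': [np.nan, 90, np.nan, np.nan]})

via_fillna = df.fillna(0)
via_replace = df.replace(np.nan, 0)

# True when both frames have identical shape, dtypes and values
same_result = via_fillna.equals(via_replace)
print(same_result)
```

When NaN is the only thing being replaced, fillna(0) is the more direct choice. replace() becomes useful when other values need to be swapped at the same time.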
Thus, we have detailed how to replace NaN for each case.
Conclusion
This article covered the main ways to replace NaN values with zeros in a Pandas DataFrame, using fillna() and replace() on a single column and on an entire DataFrame. If you have any questions or concerns, please feel free to leave a comment. We are always excited when our posts can provide useful information. Thanks for reading!