Want to convert a column to int in a Pandas DataFrame? You can do so with DataFrame.astype(int) or DataFrame.apply().
If you plan to convert a float column, keep in mind that a float can hold fractional values that an integer cannot, so the conversion drops everything after the decimal point.
Convert Column To Int In Pandas DataFrame Examples
Here are a few brief samples of changing a column's data type to integer in a Pandas DataFrame.
# Below are quick examples
# convert "Fee" from String to int
df = df.astype({'Fee':'int'})
# Convert all columns to int dtype.
# This raises an error on our DataFrame because some columns hold non-numeric strings
#df = df.astype('int')
# Convert single column to int dtype.
df['Fee'] = df['Fee'].astype('int')
# convert "Discount" from Float to int
df = df.astype({'Discount':'int'})
# Converting Multiple columns to int
df = pd.DataFrame(technologies)
df = df.astype({"Fee":"int","Discount":"int"})
# convert "Fee" from float to int and replace NaN values
df['Fee'] = df['Fee'].fillna(0).astype(int)
print(df)
print(df.dtypes)
Now create a DataFrame to run the examples against and check the output. This DataFrame has the columns Courses, Fee, Duration, and Discount.
import pandas as pd
import numpy as np
technologies= {
'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
'Fee' :["22000","25000","23000","24000","26000"],
'Duration':['30days','50days','35days', '40days','55days'],
'Discount':[1000.10,2300.15,1000.5,1200.22,2500.20]
}
df = pd.DataFrame(technologies)
print(df)
print(df.dtypes)
Output:
Courses Fee Duration Discount
0 Spark 22000 30days 1000.10
1 PySpark 25000 50days 2300.15
2 Hadoop 23000 35days 1000.50
3 Python 24000 40days 1200.22
4 Pandas 26000 55days 2500.20
Courses object
Fee object
Duration object
Discount float64
dtype: object
Keep in mind that the Discount column is float64 type, whereas the Fee column is an object/string containing integer values.
Convert Columns To Integer
Use the DataFrame.astype() method to convert columns to integers. It can be applied to a single column or to the full DataFrame. Pass numpy.int64 or "int64" to get a 64-bit signed integer, or numpy.int32 or "int32" to get a 32-bit signed integer. Passing int or numpy.int_ selects the platform's default integer, which is int64 on most systems but int32 on Windows with NumPy versions older than 2.0.
Here is an example of converting the Fee column from string to int64. Passing numpy.int64 as the dtype works as well.
# convert "Fee" from String to int
df = df.astype({'Fee':'int'})
print(df.dtypes)
Output:
Courses object
Fee int64
Duration object
Discount float64
dtype: object
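As a small sketch of the dtype spellings described above, the snippet below casts the same string column with explicit 64-bit and 32-bit names. The frame `fees` and the variable names are made up for this illustration and are not part of the tutorial's DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical one-column frame, used only for this illustration.
fees = pd.DataFrame({"Fee": ["22000", "25000", "23000"]})

# The string name and the NumPy type are two spellings of the same 64-bit dtype.
as_int64_str = fees["Fee"].astype("int64")
as_int64_np = fees["Fee"].astype(np.int64)

# An explicit 32-bit signed integer.
as_int32 = fees["Fee"].astype("int32")

print(as_int64_str.dtype)
print(as_int64_np.dtype)
print(as_int32.dtype)
```

Spelling out the width ("int64" or "int32") keeps the result the same on every platform, which plain int does not guarantee.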
If every string column in a DataFrame contains integer values, you can cast the whole DataFrame to an integer dtype in one call, as follows. Note that this raises an error if any column holds alphanumeric values, which is the case for our DataFrame (Courses and Duration).
# Convert all columns to int dtype.
df = df.astype('int')
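To make the failure mode concrete, here is a minimal sketch using a made-up two-column frame (`frame` is hypothetical, not the tutorial's `df`). It assumes the failed cast surfaces as a ValueError, which is how pandas reports a string that cannot be parsed as an integer.

```python
import pandas as pd

# Hypothetical frame: one text column, one column of numeric strings.
frame = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": ["22000", "25000"],
})

# Casting every column fails because "Spark" is not a valid integer.
try:
    frame.astype("int")
    cast_failed = False
except ValueError as exc:
    cast_failed = True
    print("Cast failed:", exc)

# Casting only the column that holds numeric strings succeeds.
converted = frame.astype({"Fee": "int64"})
print(converted.dtypes)
```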
You can also cast a particular column using Series.astype(). Every column in a DataFrame is a pandas Series, so you can select the column as a Series and call its astype() method. In the following example, df['Fee'] (or, equivalently, df.Fee) returns a Series object.
# Convert single column to int dtype.
df['Fee'] = df['Fee'].astype('int')
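The following sketch checks the claim that both access styles give the same Series before casting it. The frame and variable names are illustrative assumptions.

```python
import pandas as pd

# Hypothetical frame used only for this illustration.
frame = pd.DataFrame({"Fee": ["22000", "25000"]})

by_key = frame["Fee"]   # bracket access
by_attr = frame.Fee     # attribute access

print(type(by_key))

# Cast the Series and assign it back to the same column.
frame["Fee"] = by_key.astype("int64")
print(frame.dtypes)
```

Bracket access is the safer habit, since attribute access does not work for column names that contain spaces or clash with DataFrame method names.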
Convert Float To Integer Dtype
Use the same astype() method to change a float column in a pandas DataFrame to the int type. Keep in mind that casting a float column to an integer neither rounds nor floors the values; the fractional part is simply truncated.
The example below uses the DataFrame.astype() function to convert the Discount column, which holds float values, to integer.
# convert "Discount" from Float to int
df = df.astype({'Discount':'int'})
print(df.dtypes)
Output:
Courses object
Fee int64
Duration object
Discount int64
dtype: object
In a similar manner, you may convert all or just one column.
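To see the difference between truncating and rounding, the sketch below casts a made-up Series that includes negative values, then compares it with rounding first. The sample values are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical values, including negatives, to show truncation toward zero.
values = pd.Series([1000.10, 2300.95, -1.7, -0.2])

# astype drops the fractional part without rounding or flooring.
truncated = values.astype("int64")

# Rounding first gives the nearest integer instead.
rounded = values.round().astype("int64")

print(truncated.tolist())  # [1000, 2300, -1, 0]
print(rounded.tolist())    # [1000, 2301, -2, 0]
```

If the nearest integer is what you want, call round() before astype(); otherwise 2300.95 silently becomes 2300.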
Convert Many Columns To Int
Passing a dict that maps column names to dtypes to the astype() function allows you to convert several columns to integers at once. The sample below changes the Fee column from string to integer and the Discount column from float to integer dtype.
# Converting Multiple columns to int
df = pd.DataFrame(technologies)
df = df.astype({"Fee":"int","Discount":"int"})
print(df.dtypes)
Output (int32 appears where the platform's default integer is 32-bit; on most systems this shows int64):
Courses object
Fee int32
Duration object
Discount int32
dtype: object
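When many columns need the same target dtype, the mapping can be built from a list instead of being typed out. This is only a sketch: `frame`, `int_columns`, and `dtype_map` are hypothetical names introduced here.

```python
import pandas as pd

# Hypothetical frame mirroring the shape of the tutorial's data.
frame = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": ["22000", "25000"],
    "Discount": [1000.10, 2300.15],
})

# Build the column -> dtype mapping from a list of column names.
int_columns = ["Fee", "Discount"]
dtype_map = {name: "int64" for name in int_columns}

# Columns not named in the mapping are left unchanged.
frame = frame.astype(dtype_map)
print(frame.dtypes)
```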
Utilizing apply(np.int64) For Converting To Integer
To change the Fee column in Pandas from string to int, you can also use the apply() method on the column, which calls a function on every value. In the following sample, we pass numpy.int64.
import numpy as np
# start again from the original data, where "Fee" holds strings
df = pd.DataFrame(technologies)
# convert "Fee" from string to int using Series.apply(np.int64)
df["Fee"] = df["Fee"].apply(np.int64)
print(df.dtypes)
Output:
Courses object
Fee int64
Duration object
Discount float64
dtype: object
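As a quick comparison, the sketch below converts the same made-up Series with apply(np.int64) and with astype("int64") and prints both results. The Series `fees` is an assumption for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical Series of numeric strings.
fees = pd.Series(["22000", "25000", "23000"], name="Fee")

# apply calls np.int64 on each element, one at a time.
via_apply = fees.apply(np.int64)

# astype converts the whole column in a single call.
via_astype = fees.astype("int64")

print(via_apply.tolist())
print(via_astype.tolist())
```

Both give the same values here; astype() is usually the more direct choice for a plain dtype change, while apply() is handy when each value needs custom handling.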
Convert Column With NaNs Using astype(int)
To illustrate how NaN/null values are handled, first create a DataFrame that contains NaN. A NumPy integer column cannot hold NaN, so to convert a column with a mix of NaN and float values to integers, you first need to replace the NaN values with zero and then use the astype() function to convert.
import pandas as pd
import numpy as np
technologies= {
'Fee' :[22000.30,25000.40,np.nan,24000.50,26000.10,np.nan]
}
df = pd.DataFrame(technologies)
print(df)
print(df.dtypes)
To replace the NaN values with zero before casting to int, use fillna(), which is available on both DataFrame and Series.
# convert "Fee" from float to int and replace NaN values
df['Fee'] = df['Fee'].fillna(0).astype(int)
print(df)
print(df.dtypes)
Output (int32 appears where the platform's default integer is 32-bit; on most systems this shows int64):
Fee
0 22000
1 25000
2 0
3 24000
4 26000
5 0
Fee int32
dtype: object
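The sketch below shows why the fillna() step is needed, and also a different technique that the tutorial above does not cover: pandas' nullable "Int64" extension dtype, which keeps missing values as <NA> instead of replacing them with zero. The Series `fees` is made up, and using np.trunc to drop the fraction before the nullable cast is an assumption of this sketch.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with a missing value.
fees = pd.Series([22000.30, np.nan, 24000.50], name="Fee")

# A plain NumPy integer dtype cannot represent NaN, so a direct cast fails.
try:
    fees.astype("int64")
    direct_cast_failed = False
except ValueError:
    direct_cast_failed = True

# Approach from the text: replace NaN with 0, then cast.
filled = fees.fillna(0).astype("int64")

# Alternative (not from the text): truncate, then use the nullable Int64 dtype.
nullable = np.trunc(fees).astype("Int64")

print(direct_cast_failed)
print(filled.tolist())
print(nullable)
```

Filling with zero is simple but makes a missing fee indistinguishable from a fee of 0; the nullable dtype avoids that at the cost of using an extension dtype rather than a plain NumPy one.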
The Bottom Line
That covers how to convert a column to int in a Pandas DataFrame, with several examples to show each method in use. Put these techniques into practice, and once you have mastered them, you can move on to new skills, such as converting NumPy arrays to a Pandas DataFrame.