What is the pandas drop columns method? pandas.DataFrame.drop() deletes one or more columns from a DataFrame when you pass axis=1 or the columns parameter. By default (axis=0), it drops rows.
Rather than modifying the current DataFrame, this method returns a new DataFrame without the specified columns. Use the inplace=True parameter to delete the columns from the current DataFrame object instead.
pandas Drop Column Brief Examples
Here are a few brief examples of removing a column by name, index, etc.
# Drop single column by Name
df2=df.drop(["Fee"], axis = 1)
df2=df.drop(columns=["Fee"], axis = 1)
df2=df.drop(labels=["Fee"], axis = 1)
# Drop single column by Index
df2=df.drop(df.columns[1], axis = 1)
#Updates the DataFrame in place
df.drop(df.columns[1], axis = 1, inplace=True)
# Drop multiple columns
df.drop(["Courses", "Fee"], axis = 1, inplace=True)
df.drop(df.columns[[1,2]], axis = 1, inplace=True)
# Other ways to drop columns
df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)
df.drop(df.iloc[:, 1:2].columns, axis=1, inplace=True)
Syntax of pandas.DataFrame.drop()
Here is the method’s syntax.
# pandas DataFrame drop() Syntax
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
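- labels: single label or list-like. The index or column labels to drop.
- axis: use 0 (or 'index') to delete rows and 1 (or 'columns') to delete columns. The default is 0.
- index: single label or list-like. The row labels to drop; equivalent to labels with axis=0.
- columns: single label or list-like. The column labels to drop; equivalent to labels with axis=1.
- level: level name or int. For a MultiIndex, the level from which the labels are removed.
- inplace: defaults to False, which returns a new DataFrame. When set to True, drop() modifies the DataFrame itself and returns None.
- errors: 'raise' or 'ignore'. Defaults to 'raise'. With 'ignore', missing labels are skipped and only existing labels are dropped.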
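To make the relationship between these parameters concrete, here is a minimal sketch. The small DataFrame is made up for illustration; it only shows that labels with axis=1 matches columns, and labels with axis=0 matches index.

```python
import pandas as pd

# Small illustrative frame; the values are made up for this sketch.
df = pd.DataFrame({
    "Courses": ["Spark", "Hadoop"],
    "Fee": [20000, 26000],
    "Duration": ["30days", "35days"],
})

# labels + axis=1 and columns= drop the same columns.
by_axis = df.drop(labels=["Fee"], axis=1)
by_columns = df.drop(columns=["Fee"])
print(by_axis.equals(by_columns))  # True

# labels + axis=0 and index= drop the same rows (by index label).
by_rows = df.drop(labels=[0], axis=0)
by_index = df.drop(index=[0])
print(by_rows.equals(by_index))  # True

# inplace defaults to False, so the original frame is unchanged.
print(list(df.columns))  # ['Courses', 'Fee', 'Duration']
```

Using columns= or index= tends to read more clearly than labels plus axis, since the call itself says which axis is affected.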
Here is a complete example. The DataFrame has Courses, Fee, and Duration columns.
import pandas as pd
technologies = ({
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
})
df = pd.DataFrame(technologies)
print(df)
Output:
   Courses    Fee  Duration
0    Spark  20000     30day
1  PySpark  25000    40days
2   Hadoop  26000    35days
3   Python  22000    40days
4   pandas  24000    60days
5   Oracle  21000    50days
6     Java  22000    55days
pandas Drop Columns
The drop() method deletes columns from a DataFrame when you pass axis=1 or use the columns parameter; with the default axis=0 it deletes rows. Rather than modifying the current DataFrame, it returns a new DataFrame without the columns passed to drop().
Use the inplace=True parameter to delete columns from the current DataFrame object. If the DataFrame doesn't contain a column you ask to remove, drop() raises a KeyError; pass errors='ignore' to skip missing labels instead.
Additionally, you can remove rows by their index labels using the index parameter.
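The following sketch illustrates the errors parameter. The column name "Discount" is hypothetical and deliberately absent from the DataFrame.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "Hadoop"], "Fee": [20000, 26000]})

# "Discount" is a hypothetical column that does not exist in df.
# With the default errors="raise", drop() raises a KeyError.
try:
    df.drop(columns=["Discount"])
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# With errors="ignore", existing labels are dropped and missing ones are skipped.
df2 = df.drop(columns=["Fee", "Discount"], errors="ignore")
print(list(df2.columns))  # ['Courses']
```

errors='ignore' is convenient when the same cleanup code runs on inputs whose columns vary, but it can also hide typos in column names.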
Remove Columns By Name
In the example below, we delete the Fee column by name from the DataFrame. Note that you need axis = 1 to remove columns when you pass the names positionally or through labels.
# Drops 'Fee' column
df2=df.drop(["Fee"], axis = 1)
print(df2)
# Explicitly using parameter name 'labels'
df2=df.drop(labels=["Fee"], axis = 1)
# Alternatively, use columns instead of labels; axis is then not needed.
df2=df.drop(columns=["Fee"])
Output:
   Courses  Duration
0    Spark     30day
1  PySpark    40days
2   Hadoop    35days
3   Python    40days
4   pandas    60days
5   Oracle    50days
6     Java    55days
To update the DataFrame itself, use inplace=True.
Remove Columns By Index
To delete a DataFrame column by index, use the df.columns attribute to look up the column name at that position, then pass the name to drop(). Remember that in Python, the index begins at 0.
In the following example, df.columns[[1]] refers to the Fee column, which is the DataFrame's second column.
# Drop column by index.
print(df.drop(df.columns[[1]], axis = 1))
# using inplace=True
#df.drop(df.columns[[1]], axis = 1, inplace=True)
#print(df)
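As a quick check of what df.columns returns, the sketch below (on a made-up two-row DataFrame) shows that a scalar position gives a single label while a list of positions gives an Index of labels. drop() accepts either.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

label = df.columns[1]     # scalar position -> a single label
labels = df.columns[[1]]  # list of positions -> an Index of labels
print(label)              # Fee
print(list(labels))       # ['Fee']

# Both forms drop the same column.
df2 = df.drop(label, axis=1)
df3 = df.drop(labels, axis=1)
print(df2.equals(df3))    # True
```

Because drop() works on labels, a DataFrame with duplicate column names would lose every column that shares the looked-up name.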
Drop Multiple Columns From A DataFrame
Remove Multiple Columns By Label Name
To remove several columns by name, put the names in a list and pass the list to drop(), either through a variable or directly.
The example below removes the Fee and Courses columns.
df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)
Output:
  Duration
0    30day
1   40days
2   35days
3   40days
4   60days
5   50days
6   55days
To update the DataFrame itself, use inplace=True.
Remove Multiple Columns By Index
drop() accepts labels, not integer positions, so it cannot drop columns by position directly. You can work around this by using df.columns[] to look up the column names at the given positions and passing those names to drop().
df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)
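If you drop by position often, the lookup can be wrapped in a small helper. This is only a sketch: drop_by_position is a hypothetical name, not part of pandas.

```python
import pandas as pd

def drop_by_position(frame, positions):
    """Return a copy of frame without the columns at the given integer positions.

    Hypothetical helper for illustration; it is not a pandas function.
    """
    # Translate positions into labels, then let drop() do the work.
    return frame.drop(columns=frame.columns[list(positions)])

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

df2 = drop_by_position(df, [0, 1])
print(list(df2.columns))  # ['Duration']

# Negative positions count from the end, as with ordinary indexing.
df3 = drop_by_position(df, [-1])
print(list(df3.columns))  # ['Courses', 'Fee']
```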
Drop Columns From A List
If the column names are already stored in a list, pass the list to drop() to remove every column it names.
lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)
Other Methods To Delete Columns From A DataFrame
Remove Columns In Place
Use inplace=True if you want to remove a column from the existing DataFrame rather than create a new one. With inplace=True, drop() returns None. Here's an example.
df.drop(df.columns[1], axis = 1, inplace=True)
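The None return value is easy to trip over, so here is a short sketch on a made-up DataFrame that makes it visible.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# With inplace=True the DataFrame is modified and the call returns None.
result = df.drop(columns=["Fee"], inplace=True)
print(result)            # None
print(list(df.columns))  # ['Courses']
```

For this reason, avoid writing df = df.drop(..., inplace=True): the assignment would replace df with None.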
Delete Columns In Column List By Condition
The example below loops over the column names and deletes every column whose name contains 'Fee'. Here that removes only the Fee column.
for col in df.columns:
    if 'Fee' in col:
        del df[col]
print(df)
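An alternative to deleting inside a loop is to collect the matching names first and drop them in one call. This is a sketch of that variant, not the approach shown above; the "LateFee" column is hypothetical and added only to show several names matching.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "LateFee": [500, 700],  # hypothetical column for illustration
})

# Collect every column whose name contains "Fee", then drop them together.
to_drop = [col for col in df.columns if "Fee" in col]
df2 = df.drop(columns=to_drop)

print(to_drop)            # ['Fee', 'LateFee']
print(list(df2.columns))  # ['Courses']
```

This version returns a new DataFrame and leaves df unchanged, which can be easier to reason about than deleting columns in place.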
Delete Columns Between Specified Columns Using df.loc[]
loc[] selects columns by label, and a label slice includes both endpoints. df.loc[:, 'Courses':'Fee'] therefore selects Courses, Fee, and any column between them; passing its .columns to drop() removes them. With inplace=True, the change is applied to the original DataFrame.
df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)
print(df)
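To see which columns a label slice picks up before dropping them, you can inspect the selection first. The sketch below uses a made-up two-row DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

# Label slices with loc[] include both the start and the end label.
selected = df.loc[:, "Courses":"Fee"].columns
print(list(selected))     # ['Courses', 'Fee']

df2 = df.drop(columns=selected)
print(list(df2.columns))  # ['Duration']
```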
Delete Columns Between Specified Column Indexes Using df.iloc[]
iloc[] selects columns by integer position, and a position slice excludes its end. [:, 1:2] therefore selects only the second column (position 1).
df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1)
print(df)
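The sketch below, again on a made-up DataFrame, shows how the end of a position slice is excluded, which is the main difference from label slices.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

# Position slices with iloc[] exclude the end position.
one_col = df.iloc[:, 1:2].columns
two_cols = df.iloc[:, 1:3].columns
print(list(one_col))   # ['Fee']
print(list(two_cols))  # ['Fee', 'Duration']

df2 = df.drop(columns=two_cols)
print(list(df2.columns))  # ['Courses']
```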
Detailed Example
import pandas as pd
technologies = ({
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
})
df = pd.DataFrame(technologies)
print(df)
# Drop single column by Name
df2=df.drop(["Fee"], axis = 1)
print(df2)
df2=df.drop(columns=["Fee"], axis = 1)
print(df2)
df2=df.drop(labels=["Fee"], axis = 1)
print(df2)
# Drop column by index
df2=df.drop(df.columns[1], axis = 1)
print(df2)
# Drop multiple columns by Name
df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)
# Drop multiple columns by Index
df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)
# Drop Columns from List
lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)
# Drop columns between two columns
df2=df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1)
print(df2)
df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1)
print(df)
# Drop columns by condition
# (a no-op here, because 'Fee' was already dropped in place above)
for col in df.columns:
    if 'Fee' in col:
        del df[col]
print(df)
Conclusion
This guide has explained how to use the pandas drop() method to remove one or more columns by name, by index, in place, and by condition. You also have many examples to use as references. Now you can begin putting these functions to use.