The DataFrame.query() function is one of the most commonly used functions in Pandas. So what is it, and how do you use it to filter data in your programs? Follow this article to get the hang of it.
What Is Pandas.DataFrame.Query Function
The DataFrame.query() function in Pandas selects the rows of a dataframe that satisfy a given expression. The expression can test a single column or combine conditions on several columns.
The expression is passed as a string and is evaluated to True or False for each row; only the rows for which it is True are kept. By default the function returns a new dataframe, but it can also update the existing one.
There are two main parameters: expr and inplace. The first one is compulsory and is the string containing the query expression.
The inplace parameter is optional and takes True or False. With inplace=False (the default), the existing dataframe is left unchanged and a filtered copy is returned.
To change the existing dataframe itself, use inplace=True.
Example:
# Query Rows using DataFrame.query()
df2=df.query("Courses == 'Spark'")
#Using variable
value='Spark'
df2=df.query("Courses == @value")
#inplace
df.query("Courses == 'Spark'",inplace=True)
#Not equals, in & multiple conditions
df.query("Courses != 'Spark'")
df.query("Courses in ('Spark','PySpark')")
df.query("`Courses Fee` >= 23000")
df.query("`Courses Fee` >= 23000 and `Courses Fee` <= 24000")
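The snippets above assume that a dataframe named df already exists. Here is a minimal sketch of one, with values reconstructed from the outputs printed in this article (the construction itself is an assumption, since the original setup code is not shown), followed by a quick check of how inplace behaves:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the outputs shown in this article (an assumption).
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    "Courses Fee": [22000, 25000, 23000, 24000, 26000],
    "Duration": ["30days", "50days", "30days", None, np.nan],
    "Discount": [1000, 2300, 1000, 1200, 2500],
})

# Default (inplace=False): a new DataFrame is returned, df keeps all its rows.
spark_only = df.query("Courses == 'Spark'")
print(len(df), len(spark_only))

# inplace=True: the frame itself is filtered and the call returns None.
df_copy = df.copy()
result = df_copy.query("Courses == 'Spark'", inplace=True)
print(result, len(df_copy))
```

Working on a copy, as above, keeps the full df available for the later examples.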
How To Use Pandas.DataFrame.Query Function
Use DataFrame.query() with the @ character to refer to a Python variable inside the expression:
# Query Rows by using Python variable
value='Spark'
df2=df.query("Courses == @value")
print(df2)
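The @ prefix is not limited to strings; any local Python variable can be referenced this way. The sketch below uses a cut-down, hypothetical two-column frame and a hypothetical min_fee variable to illustrate:

```python
import pandas as pd

# Hypothetical sample data, reduced to the two columns needed here.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Courses Fee": [22000, 25000, 23000],
})

# Local variables are referenced inside the expression with the @ prefix.
value = "Spark"
min_fee = 23000  # hypothetical threshold, chosen for illustration

by_name = df.query("Courses == @value")
by_fee = df.query("`Courses Fee` >= @min_fee")

print(by_name)
print(by_fee)
```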
You can update the existing dataframe itself by passing inplace=True:
# Replace current existing DataFrame
df.query("Courses == 'Spark'",inplace=True)
print(df)
The query also lets you select rows whose values are not equal to a given value, using the != operator:
# not equals condition
df2=df.query("Courses != 'Spark'")
print(df2)
Output:
   Courses  Courses Fee Duration  Discount
1  PySpark        25000   50days      2300
2   Hadoop        23000   30days      1000
3   Python        24000     None      1200
4   Pandas        26000      NaN      2500
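For comparison, the same filter can be written with an ordinary boolean mask. The sketch below, on a hypothetical cut-down frame, checks that both forms select the same rows:

```python
import pandas as pd

# Hypothetical sample data, reduced to two columns.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    "Discount": [1000, 2300, 1000, 1200, 2500],
})

# String expression evaluated by query().
via_query = df.query("Courses != 'Spark'")

# Equivalent boolean-mask indexing.
via_mask = df[df["Courses"] != "Spark"]

print(via_query.index.tolist())
print(via_mask.index.tolist())
```

Which form to use is largely a matter of readability; the query string tends to be shorter when several conditions are combined.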
Select Rows With A List of Column Values
If you want to select rows whose column value appears in a given list of values, use the in operator. It checks each row's value against the list, which here is a list of strings.
# Query Rows by list of values
print(df.query("Courses in ('Spark','PySpark')"))
Output:
   Courses  Courses Fee Duration  Discount
0    Spark        22000   30days      1000
1  PySpark        25000   50days      2300
To select the rows whose value does not belong to the list, use the not in operator.
# Query Rows not in list of values
values=['Spark','PySpark']
print(df.query("Courses not in @values"))
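As a quick sanity check, in and not in with the same list should split the dataframe into two parts that do not overlap. Here is a minimal sketch on a hypothetical cut-down frame:

```python
import pandas as pd

# Hypothetical sample data, reduced to two columns.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    "Courses Fee": [22000, 25000, 23000, 24000, 26000],
})

values = ["Spark", "PySpark"]

in_list = df.query("Courses in @values")
not_in_list = df.query("Courses not in @values")

# Every row lands in exactly one of the two results.
print(in_list["Courses"].tolist())
print(not_in_list["Courses"].tolist())
```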
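Use Multiple Conditions
As with most table-like structures, in Pandas you will often need to select rows that satisfy several conditions at once. To do this, combine the conditions with and (or or) inside the expression. The conditions may refer to the same column, as below, or to different columns. Column names that contain spaces must be wrapped in backticks: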
# Query by multiple conditions
print(df.query("`Courses Fee` >= 23000 and `Courses Fee` <= 24000"))
Output:
  Courses  Courses Fee Duration  Discount
2  Hadoop        23000   30days      1000
3  Python        24000     None      1200
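Conditions can also span different columns. The following sketch, on a hypothetical cut-down frame with thresholds chosen only for illustration, combines Courses Fee and Discount with and and or:

```python
import pandas as pd

# Hypothetical sample data, reduced to three columns.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    "Courses Fee": [22000, 25000, 23000, 24000, 26000],
    "Discount": [1000, 2300, 1000, 1200, 2500],
})

# Both conditions must hold (and).
both = df.query("`Courses Fee` >= 23000 and Discount > 1000")

# Either condition may hold (or); parentheses make the grouping explicit.
either = df.query("(`Courses Fee` < 23000) or (Discount >= 2500)")

print(both)
print(either)
```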
Use Apply()
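The DataFrame.apply() method applies a function along an axis of the dataframe; by default the function receives one column at a time. In the example below, each column is filtered with the same boolean mask, so only the rows whose Courses value is in the given list of strings are returned:
# By using lambda function (called once per column)
print(df.apply(lambda col: col[df['Courses'].isin(['Spark','PySpark'])]))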
Output:
   Courses  Courses Fee Duration  Discount
0    Spark        22000   30days      1000
1  PySpark        25000   50days      2300
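Since apply() here only reuses one boolean mask for every column, the mask can also be passed to the dataframe directly. The sketch below, on a hypothetical cut-down frame, compares the two approaches:

```python
import pandas as pd

# Hypothetical sample data, reduced to two columns.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    "Courses Fee": [22000, 25000, 23000, 24000, 26000],
})

# Build the boolean mask once.
mask = df["Courses"].isin(["Spark", "PySpark"])

# apply() filters each column separately with the mask...
via_apply = df.apply(lambda col: col[mask])

# ...while direct indexing filters the whole frame in one step.
via_mask = df[mask]

print(via_apply)
print(via_mask)
```

The direct form is usually the simpler choice, because it avoids calling a Python function for every column.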
Conclusion
This article has explained, with various examples, how to use the pandas DataFrame.query() function.